Abstract

This  research  addresses  methods  for  exploiting  the  spatio-temporal  joint  likelihood 
of  observed  kinematic  and  nonkinematic  (sensor  signature)  physical  events  to  improve 
dynamic  object  and  target  recognition.  A  principal  direction  is  the  application  of  dynamic 
programming  sequence  comparison  techniques  to  condition  matching  of  object  signatures 
to  known  models  according  to  observed  kinematics  -  that  is,  to  use  information  from 
observed  kinematics  in  determining  allowable  aspect  angles  with  which  observed  signatures 
may  be  matched  on  models  for  candidate  objects.  A  second  direction  is  the  application  of 
kinematic/aspect-angle  Kalman  filter  trackers  to  condition  kinematic  tracking  according  to 
observed  signatures.  These  conditioning  processes  dramatically  reduce  ambiguity  in  object 
recognition,  and  can  be  used  together  or  separately  to  allow  computation  of  a  posteriori 
probabilities  of  object  class  membership  using  Bayesian  methods.  Proposals  are  supported 
by  results  of  simulated  target  tracking  and  high  range  resolution  radar  signature  analysis. 
The  original  contributions  of  this  effort  include:  (1)  new  approaches  for  and  theoretical 
understanding  of  syntactic  methods  in  multisensor  fusion  and  dynamic  object  recognition; 
(2)  extension  of  estimation  and  tracking  techniques  to  allow  object  recognition;  and  (3) 
introduction  of  a  new  performance  evaluation  technique  and  approach  for  establishing 
performance  bounds  in  dynamic  object  and  target  recognition. 
APPLICATION  OF  SEQUENCE  COMPARISON  METHODS
TO  MULTISENSOR  DATA  FUSION 
AND  TARGET  RECOGNITION 

I.  Research  Objectives 

1.1  Introduction 

The  last  ten  years  have  seen  vast  effort  applied  to  the  problem  of  fusing  information 
from  multiple  sensors  for  accurate  object  tracking  and,  ideally,  automatic  object  recognition 
(AOR)  [8,  28,  40,  122,  218]¹.  Acceptable  solutions  exist  for  many  of  these  and  other
problems  posed  under  the  heading  of  multisensor  data  fusion,  but  practical  automatic
object  recognition  is  not  yet  a  reality.  The  fundamental  shortcomings  in  this  area  are 
reflected  by  the  variety  of  efforts  underway  to  provide  better  results  [44,  97,  124, 126,  224]. 
One  aspect  of  sensor  fusion  which  has  received  comparatively  little  attention  is  the  fusion  of 
information  on  object  motion  with  information  on  other  observable  object  characteristics  - 
feature  observables  or  sensor  signatures  [120,  121,  4,  122,  209]. 

The  intent  of  this  research  has  been  to  define  new  approaches  for  fusing  sensor
information  to  improve  recognition  of  dynamic  objects  in  general  and  tactical  targets  in
particular.  The  common  link  between  the  resulting  approaches  is  their  exploitation  of  the
characteristic  relationships  between  observable  motion  and  sensor  signatures  for  typical
object  classes  and  behaviors  of  interest.  Inherently,  these  relationships  are  forced  by  physics,
and  involve  the  concept  of  joint  likelihood  of  object  motion  and  other  observables  -  that
is,  when  the  correct  object  model  is  associated  with  measurements  from  an  unclassified
object,  the  development  of  measurements  and  object  state  variable  estimates  over  time
should  tend  to  be  consistent  in  all  observable  domains.  An  incorrect  object  class
association,  on  the  other  hand,  is  more  likely  to  betray  itself  through  some  inconsistency  between
the  expected  and  observed  motion  and/or  object  signatures.

¹Note:  throughout  this  document,  citation  listings  appearing  out  of  numerical  order  indicate  that  the
citations  are  listed  in  order  of  decreasing  relevance.

Early  in  this  research,  it  was  observed  that  many  of  the  problems  inherent  in  melding
information  from  position/motion  sensors  and  feature  sensors  could  perhaps  be  addressed 
using  techniques  historically  applied  in  speech  processing  and  related  sequence  comparison 
efforts  [176,  193,  182,  195].  Another  observation  behind  this  effort  is  that  previously 
proposed  target  tracking  algorithms  [4,  120,  121,  122,  209]  would  fail  in  obvious  ways 
when  incorrect  assumptions  about  target  class  were  made,  and  that  a  target  recognition 
algorithm  could  use  that  failure  as  evidence  of  an  incorrect  choice  of  target  class. 

Ultimately  driven  by  these  insights,  the  original  contributions  of  this  research  are  the 
following: 

(1)  The  extension  of  previous  work  in  multisensor  target  tracking  [4,  120,  121,  122,  209]  to 
allow  object  recognition  through  the  method  of  multiple  model  estimator  residual  analysis. 

(2)  The  application  of  classical  sequence  comparison  techniques,  as  used  in  speech
processing,  chromosome  comparison,  and  other  areas,  to  multisensor  fusion  for  dynamic  object
recognition. 

(3)  The  association  of  (a)  dynamic  programming-based  state  estimation  techniques  with 
(b)  classical  sequence  comparison  techniques,  and  application  of  these  state  estimation 
techniques  to  multisensor  fusion  for  dynamic  object  recognition. 

(4)  The  joining  of  (1),  (2),  and  (3)  to  create  a  new  estimator  structure  exploiting  joint 
likelihood  of  object  motion  and  sensor  signature  measurements  for  object  recognition. 

(5)  Contributions  to  the  theory  of  dynamic  object  recognition  as  a  problem  in  syntactic 
pattern  recognition  and  joint  likelihood  of  all  observed  events. 

(6)  The  application  of  classical  Bayesian  parameter  estimation  and  generalized  ambiguity 
function  techniques  for  multisensor  object  recognition. 

These  contributions  provide  new  approaches  for  combining  information  to  make
object  recognition  decisions,  and  new  or  fresh  understanding  of  previous  efforts  by  other
researchers.  In  most  cases,  the  information  to  be  fused  is  already  available  from  common
state-of-the-art  sensors  and  can  be  integrated  using  techniques  to  be  shown  herein  with
the  addition  of  computational  power  only  -  no  new  sensors  are  required.  In  particular,
unlike  neural  nets,  hidden  Markov  models,  and  other  currently  popular  information
fusion  algorithms  [169,  81,  174,  175,  68],  the  approaches  discussed  here  do  not  establish
decision-making  parameters  through  training  processes  over  which  the  user  has  no  direct
visibility  -  all  decisions  are  easily  traced  and  accountable  according  to  Bayesian  theory,
dynamic  programming,  and  classic  parameter  estimation  methods.  Moreover,  all  state
variables  employed  in  this  effort  correspond  to  well-understood  physical  processes  or
statistical  representations  for  those  processes.

This  dissertation  is  organized  into  seven  chapters  and  three  supporting  appendices. 
This  first  chapter  outlines  and  justifies  the  research.  Chapter  II  discusses  the  state  of  the 
art  in  pattern  recognition,  target  tracking,  and  sensor  fusion.  Chapter  III  exploits  the 
material  of  Chapter  II  to  define  a  new  class  of  object  recognition  algorithms.  Chapter  IV 
discusses  one  element  in  this  class,  the  extension  of  kinematic/aspect-angle  tracking
filters  and  residual/state  monitoring  to  multisensor  object  recognition.  Chapter  V  discusses
another  element  in  this  class,  the  application  of  dynamic  programming-based  sequence
comparison  techniques  to  multisensor  object  recognition.  Chapter  VI  recommends
extensions  to  this  effort  to  explore  issues  and  options  not  addressed  here.  The  concluding
Chapter  VII  reviews  the  major  points  of  this  effort.  Appendix  A  gives  a  glossary  of  key
terms.  Appendix  B  provides  fundamental  background  material  in  pattern  recognition.
Appendix  C  lists  particular  equations  and  provides  results  which  are  considered  too  involved
for  the  body  of  the  text  or  are  not  part  of  the  original  effort  of  this  research. 

1.2  Research  Overview  and  Justification 

Contemporary  researchers  generally  classify  the  information  available  from  sensors 
into  two  broad  categories  -  kinematic  (or  spatial)  and  nonkinematic  (or  nonspatial)  [8:297- 
302]  [218:165].  Kinematic  or  motion-related  information  is  characteristically  limited  to 
measurements  of  object  centroid  position  and  velocity.  Nonkinematic  information  is  a 
much  more  diverse  categorization,  classically  including  “feature  attributes”  such  as  radar 
signature,  optically-derived  shape  descriptors,  the  presence  or  absence  of  certain  forms  of 
electromagnetic  emissions,  and  so  on.  Nonkinematic  information  is  often  a  direct  function 
of  the  object  or  target  aspect  angle  relative  to  the  sensor,  reflecting  the  “appearance”  of 
that  object  to  the  sensor  in  the  appropriate  spectral  or  algorithmic  sense.  As  we  shall
see,  the  dependence  of  aspect  angle  on  kinematics  makes  the  term  “nonkinematic
information”  something  of  a  misnomer  -  the  author  chooses  to  refer  to  these  quantities  as  feature
observables  or  sensor  signatures,  as  noted  in  the  chapter  introduction.

In  general,  schemes  for  fusing  kinematic  information  involve  versions  of  the  Kalman
filter/state  estimator,  in  which  measured  values  are  compared  to  mathematical  models  of
object  behavior  to  derive  kinematic  estimates  that  are  optimal  in  some  sense.  This  field  is
well-developed  and  diverse,  with  a  rich  body  of  supporting  literature  developed  over  the
past  40  years  [153,  154,  10].

In  contrast,  the  fusion  of  nonkinematic  information  for  recognition  and  tracking  is 
a  much  newer  field.  Many  approaches  to  this  problem  have  been  proposed  [218:213-261],
but  the  most  popular  alternatives  involve  only  two  basic  approaches.  First,  the  “decision
function”  or  “nearest  neighbor”  approach  [212:13]  involves  some  form  of  distance  measure
between  a  vector  of  measured  values  for  selected  features  derived  from  an  unclassified
object  on  one  hand,  and  either  (1)  sets  or  clusters  of  such  feature  values  derived  from
a  priori  testing  of  known  object  classes  (for  object  recognition)  [28,  227],  or  (2)  clusters
representing  current  tracks  in  a  trackfile  (for  observation-to-track  assignment)  [8:312].

The  second,  or  probabilistic/statistical  approach,  is  based  on  maximum  likelihood
(or  classical  inference)  methods  [218:216-219],  Bayesian  probability  [212]  [218:220-222],
or  Dempster-Shafer  evidence  accrual  [33:380-386].  Here  the  user  makes  judgments
regarding  the  probability  that  his  measurements  could  occur,  conditioned  on  the  presence  of  one
member  of  a  set  of  certain  likely  object  types,  and  combines  these  judgments  in  the
appropriate  framework  to  postulate  which  object  type  generated  the  observed  data  [14]  [8:199-
209].  It  should  be  noted  that  the  decision  function  and  statistical  approaches  are  intimately
related  in  both  theory  and  practice  [212,  92].

The  object  databases  against  which  nonkinematic  features  have  been  classically
compared  have  been  just  that  -  tabular  data  arrays,  or  feature  space  mappings,  or  decision
surfaces  in  a  feature  space  of  one  form  or  another.  Increasingly,  however,  researchers  are
turning  to  model-based  approaches,  in  which  the  feature  values  can  be  derived  on-line  in
some  fashion  from  a  three-dimensional  representation  of  the  object  [169:111-139].  In
either  case,  often  these  features  are  position-,  scale-,  and  (in-plane)  rotation-invariant  (PSRI)
global  descriptors  or  similar  quantities,  generally  functions  of  the  entire  object
representation  at  a  given  aspect  angle,  such  as  Hu  moments  and  Fourier  descriptors  [73,  111,  217]
(discussed  in  App.  B).

The  problem  with  this  taxonomy,  however,  is  that  it  is  misleading  to  impose  a
definitive  distinction  between  “kinematic”  and  “nonkinematic”  forms  of  information.  With  few
exceptions,  nonkinematic  features  are  functions  of  the  object-sensor  aspect  angle  (and
other  factors,  of  course,  like  weather).  But  this  aspect  angle  is  immediately  a  function  of 
both  object  and  sensor  (ownship)  kinematics.  An  extensive  review  of  the  literature  shows 
that  this  connection,  although  recognized  by  some  [4,  120,  121,  122,  209],  has  not  generally 
been  exploited.  It  is  apparent  to  the  author  and  others  that  sensor  fusion  efforts  do  not 
generally  use  information  about  motion  which  can  be  extracted  from  observed  features,  or 
information  about  features  which  can  be  extracted  from  motion  [9:177-178]  [30]. 

In  the  field  of  classical  pattern  recognition,  considerable  effort  has  been  made  to 
solve  the  problem  of  “recognition,  tracking,  and  pose  estimation  of  arbitrarily-shaped 
3-d  objects”  (where  the  term  “pose”  refers  to  the  object’s  aspect  angle  relative  to  the 
sensor)  [102,  105,  110,  179].  These  include  a  number  of  techniques  to  extract  shape  or 
structure  from  motion  [110:400-422]  [117,  116,  206].  Generally  these  techniques  depend  on 
(1)  making  correct  assignments  between  points  (or  loci  of  points)  on  the  observed  object 
and  corresponding  points  (or  loci)  on  a  model  representation  [69,  139],  or  (2)  matching
observed  values  of  some  global  descriptor  with  model-  or  database-derived  values  [47,  73,  111].
Techniques  in  the  first  category  usually  depend  on  high  quality  images  of  the  observed
object,  and  are  very  sensitive  to  performance  degradation  due  to  the  quality  of  images
available  from  typical  remote  imaging  sensors.  Techniques  in  the  second  category  are
sensitive  to  errors  in  segmentation  (separating  the  object  from  its  background),  occlusion  (the
presence  of  other  objects  blocking  the  object  of  interest-to-sensor  line  of  sight),  and  so  on. 
There  are  as  yet  no  final  “best”  answers. 

It  seems  clear  that  additional  research  is  warranted  to  find  methods  for  fusing  data 
from  “kinematic”  and  “nonkinematic”  sensors.  Note  that  henceforth,  because  of  the
misleading  connotation  of  the  term  “nonkinematic,”  we  will  refer  to  “nonkinematic”  quantities
as  “feature  observables”  or  “sensor  signatures.”

1.2.1  Object  Recognition  Scenarios  of  Interest.  Two  promising  military
applications  of  kinematic-feature  observable  fusion  for  automatic  object  recognition  would  appear
to  be:  (1)  fusion  of  infrared  image-derived  angle,  angle  rate,  and  feature  observable
measurements  with  laser  ranger  measurements,  and  (2)  fusion  of  high  range  resolution  (HRR)
radar  “range  sweep”  waveform  returns  and  centroid  kinematic  data  from  the  same  radar 
system  (the  latter  an  example  of  multisensor  fusion  in  the  sense  that  the  different  items 
of  information  are  extracted  by  distinctly  different  sensors  within  the  radar  system).  Note 
the  relationship  between  true  and  estimated  kinematics  and  signatures  for  targets  in  these 
scenarios: 

Case  1:  Consider  the  process  of  tracking  a  main  battle  tank  of  unknown  class  in 
a  planar  turn.  Unlike  wheeled  vehicles,  conventional  tanks  (more  generally,  vehicles
propelled  by  tracks,  rather  than  wheels)  do  not  turn  with  a  constant  or  even  continuous
radius  of  curvature,  and  in  fact  their  motion  can  be  described  as  nearly  piecewise
linear  [86].  Our  sampled-data  sensor  provides  a  sequence  of  signature  vectors  (e.g.,  classical
global  descriptors  like  Hu  moments  and  Fourier  descriptors),  as  measured  at  discrete  times
over  the  observation  period.  Concurrently,  from  conventional  range  (laser  ranger)  /  angle
tracking  and  kinematic  state  estimation  alone,  for  any  feasible  target  class,  we  can
hypothesize  a  sequence  of  expected  signature  vectors.  Comparing  the  observed  sequence  to  the
kinematically-estimated  sequence  for  the  correct  target  model  in  Fig.  1.1,  we  see  that  their
differences  can  be  described  fundamentally  in  terms  of  expansions  and  contractions  of  one
sequence  relative  to  the  other.  Is  it  possible  to  compare  these  two  sequences  such  that
their  origin  target  classes  can  be  seen  to  be  identical,  despite  expansions  and  contractions?
Will  a  sequence  comparison  process  reduce  the  likelihood  of  incorrect  identifications,  in
comparison  with  other  approaches? 

Figure  1.1.  Tank  Target  -  True  vs.  Estimated  Trajectory  (Notional)

Case  2:  We  sense  a  moving  aircraft  with  a  high  range  resolution  (HRR)  radar,
from  which  we  obtain  sampled-data  measurements  of  target  position,  doppler  velocity  (see
definition  in  App.  A)  and  “range  sweeps”  -  measurements  of  radar  cross  section  in  relatively
fine  (small  with  respect  to  typical  target  dimensions)  range  increments  or  “bins”  along  the
target-sensor  axis.  Stochastic  phase  interactions  between  returns  from  scatterers  in  any 
given  range  bin  make  the  range  sweeps  quite  noisy,  causing  deviations  from  model  and  test 
predictions  and  limiting  our  ability  to  judge  the  class  of  the  target  from  any  one  or  perhaps 
several  returns.  Generally,  however,  distinctive  elements  (peaks)  of  the  range  sweeps  stand 
out  when  returns  are  averaged,  or  viewed  over  time  as  the  aspect  angle  changes,  producing 
a  distinctive  sequence  like  the  “sinogram”  noted  in  [81]  (for  simulation-derived  examples,
the  reader  may  wish  to  look  ahead  to  Figs.  5.20,  5.21,  and  5.22).  As  in  Case  1,  the  radar 
position  and  velocity  measurements  allow  us  to  estimate  the  motion  of  the  target  centroid, 
from  which  we  can  estimate  the  target-sensor  aspect  angle  based  on  the  flight  control 
physics  required  to  achieve  that  motion.  Thus,  as  in  the  previous  case,  for  any  candidate 
target  class,  we  can  hypothesize  a  sequence  of  expected  signatures  for  comparison  with  the 
sequence  of  observed  signatures.  The  same  questions  apply  as  in  Case  1  -  the  object  of 
this  research  was  to  answer  those  questions. 

1.2.2  Discussion:  Sequences  and  Joint  Likelihood.  In  both  cases  above,  we  wish 
to  assign  an  object  (target)  to  one  class  from  a  set  of  classes  represented  by  models  which 
give  feature  observable  values  as  a  function  of  aspect  angle,  and  define  the  kinematic 
behavior  of  each  object  class.  From  our  knowledge  of  the  target  (and  ownship)  kinematics, 
and  the  dynamic  limitations  for  each  class,  we  can  estimate  the  target-sensor  aspect  angle 
as  the  target  executes  a  maneuver.  Now  suppose  that,  centered  on  the  origin  of  the 
classical  three-axis  target  body  coordinate  frame  (“pitch,  roll,  and  yaw”  axes,  using  air 
vehicle  terminology),  we  hypothesize  a  3-D  sphere  of  unit  radius,  fixed  with  respect  to  that 
body  frame. 

Fig.  1.2  illustrates  the  sphere  and  body  frame  axes  -  see  Fig.  5.23  for  the  classical 
relationship  between  body  frame  axes  and  structure  for  an  aircraft  target.  Throughout  this 
dissertation,  this  “hypothetical  aspect  angle  sphere”  will  be  a  key  framework  for  discussion.

Figure  1.2.  The  Hypothetical  Aspect  Angle  Sphere

As  the  target-sensor  aspect  changes,  the  target-sensor  unit  vector  traces  an  aspect
angle  path  or  aspect  angle  “track”  like  path  “A”  or  path  “B”  on  the  target  sphere  in  Fig.  1.2.
At  any  point  on  the  target  sphere,  and  therefore  at  any  point  along  an  aspect  angle  path  on
some  model,  we  can  predict  a  value  for  the  corresponding  feature  observables.  Recording
these  predictions  at  discrete  points  in  time  along  the  kinematically-estimated  aspect  track,
we  obtain  a  one-dimensional  string  or  sequence  of  discrete  values.  If  our  model  choice
and  aspect  angle  estimates  are  correct,  and  all  other  relevant  factors  are  equal  between 
measurement  and  prediction,  this  predicted  sequence  should  correspond  exactly  to  the 
sequence  of  measured  values. 
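
As a minimal sketch of this idea (an illustration only, not the implementation used in this research), the following Python fragment converts target-to-sensor line-of-sight vectors expressed in the target body frame into azimuth and elevation aspect angles, then samples a model signature function along the resulting track. The function names, the axis convention (x forward, y right, z down), and the toy scalar signature model are all assumptions made for the example.

```python
import math

def aspect_angles(los_body):
    """Return (azimuth, elevation) in radians for a target-to-sensor
    line-of-sight vector expressed in the target body frame.
    Assumed convention: x forward, y right, z down."""
    x, y, z = los_body
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("line-of-sight vector must be nonzero")
    # Normalizing places the point on the unit aspect angle sphere.
    x, y, z = x / norm, y / norm, z / norm
    azimuth = math.atan2(y, x)
    # z points down, so -z is the "up" component; clamp against rounding.
    elevation = math.asin(max(-1.0, min(1.0, -z)))
    return azimuth, elevation

def predicted_sequence(los_track, signature_model):
    """Sample a model's signature at each point of an aspect angle track."""
    return [signature_model(*aspect_angles(v)) for v in los_track]

def toy_signature(azimuth, elevation):
    """Hypothetical scalar signature standing in for a real target model."""
    return math.cos(azimuth) ** 2 * math.cos(elevation)

# A sensor moving from nose-on to broadside in the body x-y plane.
track = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
sequence = predicted_sequence(track, toy_signature)
```

In a real application the toy signature would be replaced by a lookup into a model or database of feature observables indexed by aspect angle.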

Of  course,  for  a  variety  of  reasons,  the  predicted  and  measured  sequences  will  never 
correspond  exactly,  even  where  the  model  corresponds  exactly  to  the  observed  target.  For 
example,  variations  in  the  signature  generation  process,  atmospheric  transmission,  and 
sensor  processing  errors  will  induce  noise  in  our  measurements  that  cannot  be  predicted 
by  a  finite-dimensional  mathematical  model.  More  subtly,  the  deviation  of  true  target 
kinematics  from  model  assumptions  will  cause  the  true  aspect  angle  path  to  lie  off  of
the  kinematically-estimated  or  “nominal”  aspect  angle  path.  This  could  happen  for  an
aircraft  target,  for  example,  if  our  model  assumes  conventional  coordinated  turn  dynamics, 
but  the  target  flies  using  advanced  control-configured  dynamics  [112]  that  do  not  follow 
coordinated  turn  rules. 

We  may  note  also  that  for  some  choices  of  object  or  target,  feature  space  and
comparison  metric  (between  observed  and  predicted  feature  values),  each  individual  element
in  the  observed  feature  sequence  could  be  used  to  define  a  “maximum  likelihood”  pose  or 
aspect  angle  estimate.  If  this  pose  information  is  known  to  be  accurate  with  respect  to  the 
correct  object  or  target  class,  it  may  be  usable  with  the  target  kinematic  information  to 
improve  estimates  for  the  future  kinematics  and  aspect  angle  /  feature  sequence. 

In  either  case,  it  is  clear  that,  for  the  proper  association  of  an  unclassified  target  or 
other  dynamic  object  with  the  correct  model  from  a  set  of  known  classes,  we  expect  that  the 
joint  likelihood  of  the  observed  quantities  in  kinematic,  feature  observable,  and  aspect  angle 
domains  will  be  consistent  over  time  -  an  incorrect  association  will  be  less  likely  to  exhibit 
the  correct  combination  of  behavior.  We  now  consider  two  basic  approaches  for  exploiting 
this  expectation  of  high  joint  likelihood.  In  both  cases,  treating  the  observable  events  as 
sequences  over  time  will  be  critical,  since  sequences  of  observations  from  physical  objects 
inherently  contain  information  about  the  joint  likelihood  of  deriving  those  observations 
from  any  particular  class  of  objects. 

1.2.3  One  Approach.  One  approach  for  fusing  kinematic  and  feature
observable  data  is  suggested  by  the  speech  recognition  technique  called  “dynamic  time  warping”
(DTW)  [176,  182,  193],  one  of  a  class  of  techniques  for  sequence  comparison  [195]  which
employ  dynamic  programming  (DP)  [23,  71].  DTW  is  a  method  for  comparing  sequences
or  strings  of  feature  vectors  or  functions  extracted  from  discretized  speech  against  a
“library”  of  feature  vector  strings  corresponding  to  selected  words.  A  close  match  between
an  observed  or  measured  vector  string  and  a  library  vector  string  establishes  the  presence 
of  that  word,  or  a  sound  sequence  “close”  in  some  sense  to  the  matched  library  word. 

The  closeness  of  two  vector  sequences  is  established  by  a  dynamic  programming-based
process,  which  “warps”  each  library  sequence  in  a  particular  fashion  to  make  it
resemble  the  observed  sequence  (or  vice-versa),  and  establishes  a  warping  path  distance
or  “cost”  for  the  matching  process.  This  cost  is  often  simply  the  sum  of  the  individual 
costs,  or  dissimilarity  measures,  obtained  by  the  matching  of  one  or  more  elements  of  one 
sequence  to  one  or  more  elements  of  the  other  sequence.  The  library  vector  sequence  with 
the  minimum  warping  path  cost  “wins”  the  comparison  process.  Note  that  the  warping 
is  required  even  when  the  observed  vector  string  and  the  library  vector  string  correspond 
to  the  same  word,  due  to  variations  in  factors  like  speaker  pronunciation  and  background 
noise. 
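
The warping-cost computation described above can be sketched in a few lines of Python. This is the basic textbook form of dynamic time warping, with no slope constraints or search window, and is not taken from this research. The function names `dtw_cost` and `best_match`, the scalar feature values, and the absolute-difference dissimilarity measure are assumptions chosen for illustration.

```python
def dtw_cost(observed, library, dist=lambda a, b: abs(a - b)):
    """Minimum warping path cost between two sequences.

    The cost of a path is the sum of the dissimilarities of the element
    pairs matched along it; dynamic programming finds the cheapest path.
    """
    n, m = len(observed), len(library)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = cheapest alignment of observed[:i] with library[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(observed[i - 1], library[j - 1])
            # Extend the best of: repeat a library element, repeat an
            # observed element, or advance both sequences together.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def best_match(observed, library_by_label):
    """Return the label of the library sequence with minimum warping cost."""
    return min(library_by_label,
               key=lambda label: dtw_cost(observed, library_by_label[label]))

# The observed sequence is a time-expanded version of library entry "A".
observed = [0, 1, 1, 2, 3, 3]
library = {"A": [0, 1, 2, 3], "B": [3, 2, 1, 0]}
winner = best_match(observed, library)
```

With vector-valued features, a different `dist` function (for example, a Euclidean norm) would be passed in; the recursion itself is unchanged.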

The  applicability  of  a  DP  sequence  comparison  process  to  sensor  fusion  arises  from 
the  fact  that  sequences  develop  naturally  in  object  recognition  -  observations  seldom  occur 
only  once  in  a  given  scenario.  As  we  saw  in  the  previous  section,  one  often  measures  a 
sequence  of  feature  observable  vectors  or  functions  (e.g.,  “range  sweeps”  from  high  range 
resolution  radar)  as  the  object-sensor  aspect  angle  changes,  while  from  the  motion  of  the 
object,  one  can  estimate  the  object-sensor  aspect  angle  for  each  potential  object  class  as 
the  features  were  observed  [5,  77,  120,  121].  Given  the  aspect  angle  estimate  sequence  for 
each  class,  we  can  then  generate  a  sequence  of  library  vectors  or  functions  representing  the 
feature  observable  sequence  which  should  have  been  produced  by  each  model  class  while 
executing  the  observed  maneuvers. 

The  basic  similarity  of  these  vectors  or  functions  to  speech  data  is  immediately
evident,  and  suggests  application  of  a  DP-based  matching  technique.  The  matching  will
never  be  exact,  yet  we  may  be  able  to  achieve  an  association  between  observed  and
expected  sequences  which  compensates  for  errors  (noise)  in  the  kinematic  estimate,  feature
observable  measurements,  and  model,  to  give  a  better  model-object  association  than  by
comparing  kinematics  or  features  in  isolation,  or  by  comparison  without  “warping”  (a
linear  comparison).  A  particularly  attractive  aspect  of  the  proposed  fusion  scheme  is  that  it
has  the  potential  to  work  for  any  object  with  (1)  feature  observables  that  can  be  expressed 
as  functions  of  aspect  angle  relative  to  the  object  body  frame,  and  (2)  dynamics  that  are 
(a)  restricted  by  orientation  and  (b)  can  be  modelled  for  state  estimation. 

Dynamic  programming  sequence  comparison-based  fusion  processes  will  henceforth 
be  referred  to  as  motion  warping  in  this  effort.  We  will  see,  however,  that  dynamic
programming  algorithms  not  heretofore  associated  with  classical  sequence  “warping”  are  also
eminently  applicable  for  our  purpose. 

1.2.4  Another  Approach.  A  number  of  techniques  have  been  proposed  which 
estimate  pose  (aspect)  directly  from  feature  observables  [102,  105,  208],  requiring  no  input 
of  kinematic  information.  These  estimates  of  pose  can  in  turn  form  the  basis  for  a  target 
tracker  using  kinematic  and  aspect  information,  as  developed  first  by  Kendrick,  Maybeck, 
and  Reid  [121]  and  extended  in  [120,  4,  209,  63].  However,  in  general  these  methods  require 
reasonably  good  a  priori  knowledge  of  the  target  class  -  if  kinematic/aspect  tracking  is 
attempted  with  an  incorrect  assumption  of  target  class,  the  tracking  filter  may  quickly 
fail,  due  to  irreconcilable  differences  between  (1)  what  the  filter  expects  to  observe  and  (2) 
what  it  actually  observes. 

For  our  purpose  of  object  recognition,  however,  we  will  exploit  this  tendency  for
tracking  failure  with  improper  object  class  associations.  These  failures  will  be  identified  using
the  classical  state  estimation  techniques  of  filter  residual  sequence  and  state-reasonableness
monitoring  -  the  correct  association  (object  class)  is  presumed  to  be  the  one  that  does  not 
fail,  or  is  the  best  among  those  that  do  not  fail. 
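
A minimal sketch of residual monitoring across several class hypotheses follows. It is an illustration only, not the kinematic/aspect-angle tracker developed in this research: each hypothetical class is reduced to a scalar Kalman filter that assumes a class-specific constant rate, and each filter is scored by its mean normalized squared residual. The names `mean_nis` and `classify_by_residuals`, the noise variances, and the gate value are assumptions.

```python
def mean_nis(measurements, rate, dt=1.0, q=0.01, r=0.04, x0=0.0, p0=1.0):
    """Run a scalar Kalman filter that assumes the state advances by
    `rate * dt` per step; return the mean normalized squared residual."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    x, p = x0, p0
    total = 0.0
    for z in measurements:
        # Propagate the state and its variance under the class hypothesis.
        x = x + rate * dt
        p = p + q
        # Residual (innovation) and its predicted variance.
        residual = z - x
        s = p + r
        total += residual * residual / s
        # Measurement update.
        k = p / s
        x = x + k * residual
        p = (1.0 - k) * p
    return total / len(measurements)

def classify_by_residuals(measurements, class_rates, gate=9.0):
    """Score every class hypothesis; hypotheses whose score exceeds `gate`
    are treated as failed, and the best survivor (if any) is returned."""
    scores = {label: mean_nis(measurements, rate)
              for label, rate in class_rates.items()}
    surviving = {label: s for label, s in scores.items() if s <= gate}
    best = min(surviving, key=surviving.get) if surviving else None
    return best, scores

# Noise-free measurements from an object that advances one unit per step.
measurements = [float(k) for k in range(1, 11)]
best, scores = classify_by_residuals(measurements,
                                     {"slow": 0.2, "fast": 1.0})
```

Here the filter built on the wrong rate accumulates consistently biased residuals, while the filter built on the correct rate does not; with noisy data the same comparison would be made statistically rather than exactly.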

1.2.5  Development  Plan.  Chapter  II  discusses  the  state  of  the  art  in  pattern 
recognition,  target  tracking  and  sensor  fusion.  Chapter  III  will  apply  Bayes’  Rule  to
develop  a  motion  fusion  framework  in  which  both  DP  sequence  comparison-based  and
kinematic/aspect  tracker-based  motion  fusion  object  recognition  algorithms  can  be  placed.
These  motion  fusion  methods  will  be  shown  to  be  forms  of  syntactic  pattern  recognition,
as  opposed  to  the  decision  theoretic  or  heuristic  methods  of  pattern  recognition  [212]  more
commonly  applied  in  tactical  target  recognition.  Feature  or  signature  observable
measurements,  kinematic  measurements,  and  a  priori  knowledge  of  object  dynamic  limitations  for
known  classes  are  used  as  independent  sources  of  information,  and  the  joint  likelihood  of
those  events  is  exploited  in  any  of  several  ways  to  form  generalized  likelihood  functions,  the
highest  value  of  which  will  indicate  the  correct  association  of  unknown  object  and  known 
object  class. 
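
One simple way to turn per-class likelihood values into a decision, sketched here under the assumption that a log-likelihood has already been computed for each candidate class, is a direct application of Bayes' Rule: the posterior probability of each class is proportional to its likelihood times its prior. The function name `class_posteriors` and the example numbers are hypothetical.

```python
import math

def class_posteriors(log_likelihoods, priors=None):
    """Return posterior class probabilities from per-class log-likelihoods.

    `log_likelihoods` maps class label -> log p(data | class).
    `priors` maps class label -> P(class); equal priors are assumed if omitted.
    """
    labels = list(log_likelihoods)
    if not labels:
        raise ValueError("at least one class is required")
    if priors is None:
        priors = {label: 1.0 / len(labels) for label in labels}
    log_joint = {label: log_likelihoods[label] + math.log(priors[label])
                 for label in labels}
    # Subtract the peak before exponentiating to avoid numerical underflow.
    peak = max(log_joint.values())
    unnormalized = {label: math.exp(v - peak) for label, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {label: u / total for label, u in unnormalized.items()}

# Hypothetical likelihoods for two candidate classes with equal priors.
posteriors = class_posteriors({"tank": math.log(0.6), "truck": math.log(0.2)})
decision = max(posteriors, key=posteriors.get)
```

Working in log-likelihoods is a design choice for the sketch: likelihoods accumulated over long measurement sequences can become very small, and sums of logarithms are better behaved than products.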

Chapters  IV  and  V  will  develop  aspects  of  the  ideas  in  Chapter  III  in  detail,  and  will
show  simulation  results  to  support  theory.  Chapter  VI  will  link  ideas  from  Chapter  III 
with  developments  in  Chapters  IV  and  V  to  show  how  these  developments  can  be  exploited 
in  a  number  of  directions. 

The  output  of  this  fusion  process  over  several  models  is  the  one  model  for  which  the
information  from  all  sources  -  kinematic,  feature  observable,  and  a  priori  model
knowledge  -  is  most  consistent  (a  form  of  maximum  likelihood  multisensor  fusion  for  object
recognition).  Chapter  VI  will  discuss  applications  for  this  approach.  For  example,  we  will 
define  paths  by  which  to  attack  some  problems  in  tactical  target  recognition  now  considered 
to  be  unsolvable. 

We  will  also  show  that,  since  kinematic-feature  observable  fusion  algorithms  can  be 
considered as maximum likelihood estimators or generalized likelihood functions, these
algorithms are suitable for analysis using the classical state/parameter estimation tool of
generalized ambiguity functions [154, 198]. This tool will allow us to contrast motion-feature
fusion performance using different feature observables, feature comparison metrics,
and object kinematic model assumptions against the performance of (1) truly optimal (if
unrealizable in practice, using truth information not available to an actual sensor) maximum
likelihood techniques and (2) more conventional techniques. Significantly, evaluations
with  generalized  ambiguity  functions  will  lead  naturally  to  a  form  of  Cramer-Rao  lower 
bound  [184,  155]  for  the  performance  of  object  recognition  algorithms. 

1.3  Conclusion 

In  Sect.  1.2.1,  we  proposed  the  existence  of  observed  and  expected  measurement 
sequences, and asked two questions: (1) Is it possible to compare these two sequences such
that their origin target classes, if identical, can be seen to be so despite expansions and
contractions? (2) Will a sequence comparison process reduce the likelihood of incorrect
identifications,  in  comparison  with  other  approaches?  This  research  will  show  that  the 
answer to both questions is yes.

The contributions of this research are (1) to fold dynamic time warping and related
sequence comparison techniques as maximum likelihood methods into the structure of multisensor
fusion, (2) to extend previous efforts in target tracking so as to provide dynamic
object  and  target  recognition,  (3)  to  provide  a  Bayesian  structure  for  understanding  fusion 
of  kinematic  and  feature  observable  information,  and  (4)  to  introduce  the  use  of  generalized 
ambiguity  functions  as  an  analysis  technique  for  gauging  the  effectiveness  of  these  and  other 
multisensor  fusion  methods.  These  developments  provide  new  theoretical  understanding 
for object recognition, extend and unify the results of previous researchers, and provide
paths  for  attacking  currently  unsolvable  problems. 

This chapter has defined the inspiration and justification for “motion warping” and
other approaches for fusion of “kinematic” and “nonkinematic” information. In turn, it
has outlined the structure within which this dissertation will develop and demonstrate the
proposed approaches. The following chapter will examine related concepts in pattern
recognition, target tracking, multisensor fusion, state and parameter estimation, and speech
processing.

II.  Background


2.1  Introduction

The purpose of this chapter is to outline the five key disciplines which have contributed
to, or form departure points for, the ideas embodied in this dissertation. These
key disciplines are: (1) pattern recognition and automatic object recognition, (2) object
tracking and state estimation, (3) dynamic programming sequence comparison, (4) multisensor
fusion, and (5) the theory of generalized ambiguity functions. The field of multisensor
fusion has deep roots in pattern recognition and object tracking, and has seen some
application of dynamic programming. This research has greatly strengthened the utility
of dynamic programming in multisensor fusion through the concept of “motion warping”,
and uses generalized ambiguity functions to quantify this utility.

Much of the following discussion is oriented toward a specific class of dynamic physical
objects - specifically, objects that may be identified as targets in military operations
or other remote sensing scenarios. The use of the word “target” rather than “object”
in radar sensing is a historical custom - however, the reader should keep in mind that
the fundamental issue in this research is to explore and exploit for recognition the generally
characteristic physical relationships between object behavior and object appearance -
whatever  the  object,  sensor,  or  scenario. 

Chapter  III  will  develop  the  theory  for  a  class  of  object  recognition  algorithms  with 
this  objective,  and  Chapters  IV  and  V  will  show  results  from  representative  algorithms. 

2.2  Pattern  Recognition  /  Automatic  Object  Recognition 

This  section  provides  a  structure  for  relating  the  accomplished  research  to  elements 
in  the  vast  array  of  existing  concepts  for  pattern  recognition  and  non-cooperative  target 
recognition  (NCTR).  These  fields  have  grown  explosively  and  in  innumerable  directions 
in  the  past  ten  years,  creating  a  need  for  articles  and  texts  that  simply  define  taxonomy 
without technical detail [28, 55, 89, 97, 218], and for conferences which bring together
experts  in  different  yet  related  fields  who  have  worked  theretofore  in  isolation  from  one 
another [42, 43, 44, 124, 126, 231].

The first two of the following subsections consider in turn the two primary classical
pattern recognition approaches - decision theoretic and syntactic. Finally, we discuss concepts
for object recognition with high range resolution (HRR) radar that form a departure
point  and  comparison  baseline  for  the  original  research  described  here.  Additional  material 
on  pattern  recognition  of  a  generally  tutorial  nature  is  found  in  App.  B. 

2.2.1  Decision Theoretic Pattern Recognition.  Decision theoretic [90] or “statistical”
[92] pattern recognition methods begin with a choice of observable mathematical
quantities  or  features  by  which  to  describe  objects  of  interest.  A  particular  set  or  vector 
of  features  associated  with  an  object  or  measurement  defines  a  point  in  a  feature  space. 

As  discussed  in  the  previous  chapter,  features  or  feature  observables  for  physical 
objects are generally functions of aspect angle. For an ideally-chosen feature set, the
values of features for different classes and discrete orientations (generally a given number
of aspect angle “bins”) will cluster separately in the feature space. Cluster dispersion for
any particular object class/aspect angle will occur due to atmosphere and sensor noise,
minor object variations, etc. Well-chosen features provide low ambiguity in the specific
sense of that term used in [154:97] - that a particular combination of measured feature
values  tends  to  identify  a  particular  object  and  its  orientation  uniquely. 

A particularly desirable form of decision theoretic classifier is one which yields $p(\omega_i \mid z)$
for each class $\omega_i$ of J possible object classes - that is, the probability p that an observed
object is a member of class $\omega_i$ given that we have observed some sensor measurement
(vector) value z. This probability is commonly obtained through Bayes’ Rule [212, 197]:

$$p(\omega_i \mid z) = \frac{p(z \mid \omega_i)\,p(\omega_i)}{p(z)} = \frac{p(z \mid \omega_i)\,p(\omega_i)}{\sum_{j=1}^{J} p(z \mid \omega_j)\,p(\omega_j)} \qquad (2.1)$$

in which p represents a probability or probability density function (if the latter exists),
z represents an observed or measurement value, and $\omega_i$ is the i-th object class (ignoring
aspect angle variations for the moment) from a total of J classes.

Bayes’  rule  defines  the  Bayesian  classifier  -  a  theoretically  optimal  classifier  under 
certain conditions [212:113]. Use of this and related probabilistic or “parametric” classifiers
requires that we establish the probability or probability density $p(z \mid \omega_i)$ for each
combination of possible measurement z and known object class $\omega_i$. This is the process
of density estimation, a form of “training” generally conducted by experiment using techniques
discussed in detail in [212:134-154]. Density estimation is generally a nontrivial
process  and  presents  the  major  obstacle  in  implementing  parametric  classifiers  [28]. 

For  some  actual  observed  z  from  an  unclassified  object,  the  derived  classical  likelihood 
$p(z \mid \omega_i)$ of that observation from each class provides the new information in a parametric
classification process. Often, $p(z \mid \omega_i)$ is assumed to be described adequately as Gaussian.
The a priori probability $p(\omega_i)$ is, in military scenarios, often taken from order-of-battle
information which estimates the relative fractions of numbers of each object (target) class
expected on the battlefield. Under the (usually weak) assumption that we have this data
for all potential objects, the rightmost denominator in the previous equation - the sum
of numerator terms for all object classes - allows us to define the a posteriori probability
measure $p(\omega_i \mid z)$ for a given observed z. The i for which $p(\omega_i \mid z)$ is greatest is taken to
identify  the  object  class. 
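As a minimal sketch of the a posteriori computation in Eqn. (2.1), assume a scalar measurement z and Gaussian class-conditional densities. The class names, means, standard deviations, and prior fractions below are hypothetical values chosen only for illustration; they are not taken from any data in this research.

```python
import math

def gaussian_pdf(z, mean, std):
    """Scalar Gaussian density, an assumed form for p(z | class)."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def posteriors(z, classes, priors):
    """Return p(class | z) for every class via Bayes' Rule."""
    # Numerator terms p(z | class) * p(class) for each class.
    joint = {name: gaussian_pdf(z, mean, std) * priors[name]
             for name, (mean, std) in classes.items()}
    evidence = sum(joint.values())  # p(z): sum of numerator terms over all classes
    return {name: value / evidence for name, value in joint.items()}

# Hypothetical two-class problem: (mean, std) of one scalar feature per class.
classes = {"class_1": (0.0, 1.0), "class_2": (3.0, 1.0)}
# Hypothetical a priori fractions (e.g. from order-of-battle information).
priors = {"class_1": 0.7, "class_2": 0.3}

post = posteriors(1.0, classes, priors)
best = max(post, key=post.get)  # class with the greatest a posteriori probability
print(post, best)
```

The selected class is simply the key with the largest posterior; the posteriors sum to one because the denominator is the sum of the numerator terms.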

In  the  absence  of  specific  a  priori  information  on  the  object  class  distributions,  we 
may choose to assume them equal, i.e., that $p(\omega_i)$ is the same for each $\omega_i$. These terms
then  cancel  from  the  numerator  and  denominator  of  Eqn.  (2.1)  to  produce  the  often-seen 
form: 


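$$p(\omega_i \mid z) = \frac{p(z \mid \omega_i)\,p(\omega_i)}{\sum_{j=1}^{J} p(z \mid \omega_j)\,p(\omega_j)} = \frac{p(z \mid \omega_i)}{\sum_{j=1}^{J} p(z \mid \omega_j)} \qquad (2.2)$$

also called a (conditional) maximum likelihood classifier. In this form, the a posteriori
probability per se is often irrelevant, since the selected class $\omega_i$ will simply be the one
which maximizes $p(z \mid \omega_i)$ - the classical likelihood function.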

A  further  modification  [10,  33]  to  this  technique  is  to  allow  for  generation  of  a  z  by 
some  source  other  than  a  known  object  class,  such  as  noise  and/or  the  set  of  all  unknown 
object classes. This results in the addition of an equivalent “J + 1st” object class.

For decision theoretic recognizers which record feature space information for each
object as a function of aspect angle (generally a finite number of aspect angle bins, as
noted above), the “best” feature match for any given object model is often used to provide
a “best” estimate of object aspect angle - a pose estimate. Note that the pose estimate per
se is an angle value, and provides no information about the closeness of the model-object
match in a feature space distance sense. A sequence of pose estimates, or pose estimate
history, is the result of matching a sequence of feature observable measurements over time
to  a  given  model. 
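The pose estimate history can be illustrated with a short sketch. It assumes a squared Euclidean distance in feature space and a hypothetical model with four aspect angle bins and two features per bin; neither the metric nor the values come from the algorithms evaluated in this research.

```python
def pose_estimate(measurement, model):
    """Return (aspect_bin, distance) for the model entry closest to the measurement.

    `model` maps an aspect angle bin (degrees) to a stored feature vector.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best_bin = min(model, key=lambda b: sq_dist(measurement, model[b]))
    return best_bin, sq_dist(measurement, model[best_bin])

# Hypothetical model: one two-element feature vector per aspect angle bin.
model = {0: (1.0, 0.0), 90: (0.0, 1.0), 180: (-1.0, 0.0), 270: (0.0, -1.0)}

# Hypothetical sequence of feature measurements over time.
measurements = [(0.9, 0.1), (0.2, 0.8), (-0.7, 0.3)]

# "Independent look" matching: each measurement is matched without
# regard to the previous pose estimates.
pose_history = [pose_estimate(m, model)[0] for m in measurements]
print(pose_history)
```

Note that the returned angle alone says nothing about match quality, which is why the sketch also returns the distance alongside it.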

Most object recognition approaches are “independent look” algorithms - that is,
they  do  not  consider  the  location  of  previous  pose  estimates  in  making  the  most  current 
estimate.  But,  for  many  physical  objects  (and  tactical  targets  in  particular),  transitions 
between  pose  estimates  imply  object  motion  and/or  random  fluctuations  of  some  kind 
in  the  feature  observable  measurements.  It  would  seem,  therefore,  that  allowable  pose 
estimate  transitions  should  be  restricted  in  some  way,  so  as  to  be  consistent  with  other 
information  about  each  object  model,  as  observed  or  known  a  priori.  This  observation, 
made  by  the  author  and  independently  by  others  [20,  136,  164],  but  little  developed  to 
date,  is  a  prime  motivation  behind  this  research. 

2.2.2  Syntactic  Pattern  Recognition.  It  happens  that  decision  theoretic  classifiers 
are  ill-posed  to  derive  information  from  the  change  of  object  observations  over  time.  In 
our  research,  we  will  exploit  these  time  domain  relationships  -  therefore,  we  now  consider 
syntactic  or  structural  classifiers,  a  class  of  algorithms  well-suited  for  considering  time 
domain  relationships.  Ultimately,  all  of  the  new  techniques  in  this  research  will  be  shown 
to  be  syntactic  classifiers.  The  following  points  are  taken  largely  from  [212],  with  a  flavor 
of  more  recent  developments  from  [90,  161],  and  examples  or  elaboration  by  the  author. 

Fundamentally, decision theoretic techniques reduce all information about an unclassified
object into quantities, and assign class membership based on information about
corresponding quantities for known object classes. Syntactic pattern recognition, on the
other hand, looks for structure or order in feature observables, and assigns class membership
based on similar structure for some known class. In general, the input to a syntactic
classifier is a sequence of “features” (often called primitives or terminals in syntactic systems)
from some unclassified object, which is compared to corresponding sequences for
known  object  classes.  This  branch  of  pattern  recognition  arose  from  efforts  to  define 
mathematical  models  of  grammar  for  computer  analysis  of  human  language  -  hence  the 
incorporation  of  many  terms  commonly  associated  with  languages. 

First,  consider  an  example  to  illustrate  the  difference  between  decision  theoretic  and 
syntactic  classifiers.  Suppose  we  wish  to  design  a  word  recognizer,  using  individual  letters 
as  features.  A  simple  decision  theoretic  classifier  might  then  identify  the  word  CAT  by 
its position in some 26-dimensional alphabetic-quantity feature space corresponding to the
presence of one A, one C, and one T. Unfortunately, being completely unable to distinguish
order, this classifier would be unable to discriminate the word CAT from the abbreviations for
Tactical  Air  Command  or  Air  Training  Command  (lest  this  proposal  seem  too  ridiculous, 
note  that  a  language  recognizer,  working  in  this  way  on  relatively  long  passages  of  text, 
might  well  be  a  workable  yet  simple  proposition).  A  syntactic  word  classifier,  however, 
would  distinguish  between  the  three  “words”  by  considering  their  structure,  or  order  of 
the  letters. 
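The CAT example can be reproduced in a few lines. In this sketch the letter-count comparison stands in for the decision theoretic classifier and an ordered comparison stands in for the syntactic one; both functions are illustrative only.

```python
from collections import Counter

def same_by_counts(a, b):
    """Decision theoretic stand-in: compare positions in letter-count feature space."""
    return Counter(a) == Counter(b)

def same_by_order(a, b):
    """Syntactic stand-in: compare the ordered sequences of letters."""
    return list(a) == list(b)

words = ["CAT", "TAC", "ATC"]

# Letter counts cannot separate the three "words"; letter order can.
count_matches = [w for w in words if same_by_counts(w, "CAT")]
order_matches = [w for w in words if same_by_order(w, "CAT")]
print(count_matches, order_matches)
```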

The  following  terms  are  basic  to  discussion  of  syntactic  pattern  recognition: 

(1)  An  alphabet  is  a  finite  set  of  symbols. 

(2)  A  sentence  (also  string  or  word)  over  an  alphabet  is  any  string  of  finite  length  composed 
of  symbols  from  the  alphabet. 

(3)  A  language  is  any  set  (finite  or  infinite)  of  sentences  over  an  alphabet. 

(4)  Each  language  has  a  unique  grammar,  which  describes  the  structure  of  its  sentences, 
and can be defined as the four-tuple $(V_N, V_T, P, S)$, where

$V_N$ is a set of nonterminals (variables);

$V_T$ is a set of terminals (constants);

P is a set of productions or rewriting rules;

S is the start or root symbol (often corresponding to an idea to be expressed).

Grammars in turn are assigned into one of four type categories (types 0, 1, 2, or 3)
according to the kinds of productions allowed in each. Productions are really the rules of
the grammar - in a language sense, they tell us how one set of variables can be replaced
by another set of variables and/or constants. The most general forms of grammar are
“unrestricted”  or  “type  0”  grammars,  for  which  any  set  of  variables  can  be  replaced  by 
another  set  of  variables  and/or  constants,  or  none  at  all,  without  regard  to  context  or 
previous  use  of  the  chosen  variables  and  constants  -  factors  which  are  critical  for  the  other 
grammar  types. 
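To make the four-tuple concrete, the following sketch enumerates the short sentences of a small hypothetical grammar with one nonterminal S, terminals a and b, and productions S -> aSb and S -> ab. The grammar and the `generate` helper are illustrative assumptions; the helper relies on every production lengthening the string, so that the length bound guarantees termination.

```python
def generate(productions, start, max_len):
    """Enumerate terminal strings derivable from `start`, up to max_len symbols.

    `productions` maps each nonterminal to a list of right-hand sides.
    Assumes no production shortens the string, so the search terminates.
    """
    results, frontier = set(), [(start,)]
    while frontier:
        form = frontier.pop()
        if len(form) > max_len:
            continue  # prune sentential forms that are already too long
        # Locate the leftmost nonterminal, if any remains.
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            results.add("".join(form))  # all terminals: a sentence of the language
            continue
        for rhs in productions[form[idx]]:
            frontier.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return results

# Hypothetical grammar: nonterminals {S}, terminals {a, b}, start symbol S.
productions = {"S": [("a", "S", "b"), ("a", "b")]}
sentences = generate(productions, "S", 6)
print(sorted(sentences, key=len))
```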

The  objective  of  a  syntactic  pattern  recognition  or  classification  process  is  to  identify 
the  language  which  generated  a  given  (unclassified)  string  or  sentence  of  terminals  or 
features.  This  is  done  by  analyzing  the  string  to  determine  which  grammar  most  probably 
generated  the  string  -  a  process  called  parsing.  Parsing  techniques  are  classified  as  either 
top-down  or  bottom-up.  In  a  bottom-up  technique,  we  start  with  the  sentence  itself  and 
attempt  to  subdivide  it  into  units  which  reveal  its  grammatical  structure.  In  a  top-down 
technique, we start with each language/grammar set and attempt to construct the observed
sentence,  ultimately  determining  which  grammar  will  most  readily  do  so. 

As  a  simple  example  of  a  top-down  technique,  suppose  that  we  have  a  sequence  of 
spoken  sounds  in  an  unknown  language  which  we  believe,  based  on  a  priori  information,  to 
have the English meaning, “the car is broken.” We wish to determine the origin language.
Using a top-down parsing procedure, we start with each language and work down through
the grammatical structure (variables like noun, verb, etc.) to generate the appropriate
sequences (one per candidate language) of terminals or constants - sound primitives. Finally,
using sequence comparison techniques such as those to be discussed in Sect. 2.4.2,
we can compare each generated sequence in turn with the observed sequence, assigning
the observed sequence to the language for which we find the closest sequence-to-sequence
match. 
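The final comparison step can be sketched with a dynamic programming edit distance. The Levenshtein distance used here is a stand-in assumption for the sequence comparison techniques of Sect. 2.4.2, and the generated and observed symbol sequences are hypothetical placeholders for sound primitives.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))  # cost of building b[:j] from an empty prefix
    for i, x in enumerate(a, start=1):
        curr = [i]  # cost of deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # match or substitute
        prev = curr
    return prev[-1]

# Hypothetical terminal sequences generated top-down, one per candidate language.
generated = {"language_A": list("abcde"), "language_B": list("axcyz")}
observed = list("abde")  # hypothetical observed sequence (one primitive dropped)

distances = {name: edit_distance(observed, seq) for name, seq in generated.items()}
origin = min(distances, key=distances.get)  # closest sequence-to-sequence match
print(distances, origin)
```

Insertions and deletions in the alignment play the role of the expansions and contractions that the sequence comparison must tolerate.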

In  addition  to  the  sequence  comparison  techniques  referred  to  above  for  grammar 
determination, another parsing technique is provided by the use of finite automata [212,
70, 161]. A finite automaton A over an alphabet $\Sigma$ is defined as:


$$A = (K, \Sigma, \delta, q_0, F) \qquad (2.3)$$

where K is a finite, nonempty set of states, $\Sigma$ is a finite input alphabet, $\delta$ is a mapping of
$K \times \Sigma$ into K, $q_0$ in K is the initial state, and $F \subseteq K$ is the set of final states.

In practice, one uses an automaton as a sort of “matched filter” to identify the
grammar  of  a  given  string  of  terminals  (constants/features/primitives)  [70:190-214].  An 
automaton  is  devised  for  each  grammar,  and  the  string  is  “read”  terminal-by-terminal  by 
each  automaton.  Each  terminal  element  in  general  causes  a  state  change  -  if  the  automaton 
rests  in  one  of  its  allowable  final  states  after  reading  the  string,  the  string  is  said  to  have 
been  accepted  by  that  automaton.  The  set  of  all  sequences  accepted  by  an  automaton 
indeed  defines  a  language,  and  there  is  a  one-to-one  correspondence  between  grammars 
and  automata.  The  use  of  automata  for  language  recognition  constitutes  a  bottom-up 
parsing  technique. 
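A minimal sketch of string acceptance by a finite automaton follows. The automaton is a hypothetical one that accepts one or more a's followed by one or more b's; treating a missing transition as rejection is an implementation assumption rather than part of the formal definition.

```python
def accepts(delta, q0, final_states, string):
    """Read `string` symbol by symbol; accept if the automaton rests in a final state."""
    state = q0
    for symbol in string:
        if (state, symbol) not in delta:
            return False  # assumption: an undefined transition rejects the string
        state = delta[(state, symbol)]
    return state in final_states

# Hypothetical automaton over the alphabet {a, b}: states q0, q1, q2; final state q2.
delta = {
    ("q0", "a"): "q1",
    ("q1", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "b"): "q2",
}

for s in ["ab", "aab", "ba", "aa", ""]:
    print(repr(s), accepts(delta, "q0", {"q2"}, s))
```

With one such automaton per grammar, the unclassified string would be assigned to the grammar whose automaton accepts it.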

Straightforward  extensions  to  these  ideas  lead  to  the  concept  of  stochastic  grammars 
and  automata,  reflecting,  for  example,  the  real  world  variations  and  associated  probability 
densities  with  which  any  given  idea  can  be  expressed  in  a  given  language  using  different 
combinations of words and speaker pronunciations. Much more could be said about syntactic
pattern recognition - we have discussed here only the topics and techniques needed
for  later  development. 

Syntactic pattern recognition approaches for recognition of dynamic objects and tactical
targets are much less common than methods using decision theoretic or “statistical”
approaches.  A  particular  set  of  references  showing  recent  efforts  to  use  syntactic  methods 
for  radar  signal  classification  is  [49,  194]  -  an  older  reference  (about  which  more  will  be 
said  in  Sect.  2.5.7  and  elsewhere)  is  [136]. 

A  significant  development  in  syntactic  pattern  recognition  for  dynamic  object  recog¬ 
nition  was  the  1978  paper  by  Therrien  [211].  He  applied  the  concepts  of  linear  predictive 
signature  estimation  and  sequential  hypothesis  testing  [216]  to  two-class  discrimination. 
Therrien’s approach was drawn from the class of linear estimation techniques which make
predictions of the next anticipated measurement, based on past measurements, and compare
the actual observed measurement to this prediction to derive corrections for the
current  estimate,  followed  by  a  prediction  for  the  subsequent  measurement,  and  so  on. 

This  difference  between  predicted  and  observed  measurements  has  been  termed  the 
innovation  [118]  or  residual  [153:218].  Under  nonrestrictive  conditions,  for  the  correct 
association of estimator and observed process, as we will see in Sect. 2.3.1.3, the residual
sequence  has  interesting  properties  which  make  it  an  indicator  of  the  likelihood  that  the 
observed  measurements  were  generated  by  an  object  with  parameters  like  those  of  the 
estimator  model. 
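The role of the residual sequence can be illustrated with a deliberately simple sketch. It assumes a first-order linear predictor with a fixed coefficient per hypothesized model and omits the estimate correction step; it is not Therrien's algorithm, and the coefficients and disturbance values are hypothetical.

```python
def residuals(measurements, coefficient):
    """One-step prediction residuals for the model z[k] ~ coefficient * z[k-1]."""
    return [z - coefficient * prev
            for prev, z in zip(measurements, measurements[1:])]

def residual_energy(measurements, coefficient):
    """Sum of squared residuals: small when the model matches the observed process."""
    return sum(r * r for r in residuals(measurements, coefficient))

# Synthetic measurement sequence generated with a hypothetical coefficient of 0.9
# and a fixed list of small disturbances standing in for measurement noise.
true_coefficient = 0.9
disturbances = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01]
measurements = [1.0]
for w in disturbances:
    measurements.append(true_coefficient * measurements[-1] + w)

# Two hypothesized models; the correct association leaves only the disturbances.
energy = {a: residual_energy(measurements, a) for a in (0.9, 0.5)}
best_model = min(energy, key=energy.get)
print(energy, best_model)
```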

The drawback to this approach is that dynamic objects and tactical targets in physical
space seldom have sensor signatures that are amenable per se to linear predictive
estimation,  as  we  will  discuss  in  Sect.  3.3.  A  major  direction  in  this  research,  however,  has 
been  to  define  approaches  for  using  techniques  similar  to  those  of  Therrien  where  possible, 
and  to  define  extensions  useful  where  linear  or  quasi-linear  methods  do  not  apply. 

In  addition  to  linear  predictive  methods,  sequential  hypothesis  testing  can  be  shown 
to  be  applicable  to  the  multisensor  fusion  algorithms  defined  in  this  research  -  determining 
how long sequence comparison or “motion warping” needs to continue in any object recognition
scenario to make a decision with desired low probability of incorrect classification.
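As a sketch of how a stopping rule of this kind might look, the following assumes the standard two-class sequential probability ratio test with Wald's approximate thresholds. The per-observation log-likelihood ratios are hypothetical inputs; how they would be derived from a motion warping score is not specified here.

```python
import math

def sprt(log_likelihood_ratios, alpha, beta):
    """Two-class sequential probability ratio test.

    alpha: tolerated probability of choosing class_1 when class_2 is true.
    beta:  tolerated probability of choosing class_2 when class_1 is true.
    Returns (decision, number of observations used).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept class_1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept class_2 at or below this
    total, n = 0.0, 0
    for llr in log_likelihood_ratios:
        n += 1
        total += llr  # accumulate evidence one observation at a time
        if total >= upper:
            return "class_1", n
        if total <= lower:
            return "class_2", n
    return "undecided", n  # data exhausted before either threshold was crossed

# Hypothetical evidence streams (log of p(z | class_1) / p(z | class_2) per look).
print(sprt([0.8] * 10, alpha=0.05, beta=0.05))
print(sprt([-1.0] * 10, alpha=0.05, beta=0.05))
print(sprt([0.1, -0.1], alpha=0.05, beta=0.05))
```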

A recent development in the area of syntactic pattern recognition with tactical applications
is the effort by Seibert and Waxman [199, 220], who trained neural nets to
recognize  3-D  objects  (aircraft)  using  sequences  of  images  over  changing  aspect  angles. 
The  referenced  sources  list  related  efforts.  Seibert  and  Waxman  do  not  formally  refer  to 
their  efforts  as  syntactic  in  nature,  but  their  descriptions  of  the  factors  which  motivated 
them  in  their  chosen  direction  speak  eloquently,  if  implicitly,  of  the  existence  and  utility  of 
“grammars”  governing  the  behavior  of  real  objects  over  time.  Other  recent  developments 
in  the  area  of  syntactic  pattern  recognition  are  mentioned  in  the  following  subsection. 

2.2.3  Example:  Target  Recognition  by  High  Range  Resolution  (HRR)  Radar. 

The  preceding  part  of  this  section  has  been  oriented  toward  theoretical  methods.  Since 
the accomplished research used simulated high range resolution (HRR) radar as the feature
observable sensor, in this final part of our object recognition overview, we will focus
on current approaches to HRR radar target recognition. We will see that both decision
theoretic and syntactic approaches have been applied to this problem. The research accomplished
for this dissertation significantly increases the theory and practical scope of
syntactic  methods  for  HRR  radar  and  other  target  recognition  problems. 

Using  high  frequency  radar  pulses  with  specialized  wideband  waveforms,  it  is  possible 
to  isolate  radar  returns  as  a  function  of  distance  (range)  along  the  sensor-object  vector  [203, 
204,  221,  213,  78,  164].  The  returns  are  summed  (as  complex  vectors)  into  “range  bins” 
which  partition  the  distance  along  the  range  vector:  unlike  conventional  radar,  HRR  radar 
range  bins  are  small  with  respect  to  actual  object  extent  as  projected  along  the  range 
vector,  perhaps  several  hundred  range  bins  corresponding  to  the  actual  object  extent.  The 
“range  sweep,”  or  returned  waveform  function  (radar  cross  section  or  “RCS”  plotted  as  a 
function  of  range)  is  recognized  to  be  characteristic  of  each  object  class  at  each  particular 
aspect  angle,  principally  due  to  peaks  in  the  waveform  representing  collections  of  major 
scatterers  at  given  distances  along  the  sensor-object  vector  (although  not  generally  at  the 
same  location  on  the  body  of  the  object,  i.e.,  possibly  separated  in  crossrange). 

Fig.  2.1  shows  a  collection  of  three  measured  HRR  radar  signatures  for  an  unidentified 
aircraft  -  the  length  of  the  figure  is  on  the  order  of  the  size  of  an  aircraft,  so  that  the  bin 
widths  are  on  the  order  of  0.05  to  0.50  meters  (the  actual  size  of  the  aircraft  is  not  available). 
The  rearmost  part  of  the  target  aircraft  lies  to  the  left.  Each  of  these  signatures  is  actually 
the  result  of  summing  returns  from  several  dozen  individual  pulses  taken  over  a  period  of 
much  less  than  one  second.  More  will  be  said  about  these  signatures  in  Chapter  V. 

HRR radar signature generation can be treated theoretically as the result of convolving
a sinc-shaped (i.e., a function of the form (sin x)/x) radar pulse in range/time (i.e.,
a  rectangle  in  frequency  with  chosen  center  frequency  and  bandwidth)  with  an  array  of 
scatterers  represented  as  impulse  (i.e.,  Dirac  delta)  functions  in  range/time.  As  we  will  see 
in  Chapter  V,  many  effects  observed  in  simulations  and  tests  are  readily  explained  from 
this perspective, at least to first order.

Figure 2.1.  Three Typical High Range Resolution Radar Signatures - Rear Aspect (test
data taken from tape GTwllAtran.dathr, results of [20], provided by [166],
with further processing by the author). Panel title: “Three Typical Signature
Realizations (Test Data)”.
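The convolution view of signature generation described above can be sketched directly: convolving a (sin x)/x pulse with impulses reduces to summing shifted, scaled copies of the pulse. The sketch assumes real-valued scatterer amplitudes (phase is ignored), and the scatterer positions, amplitudes, and `width` parameter are hypothetical.

```python
import math

def sinc(x):
    """The (sin x)/x pulse shape, with its limiting value of 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def range_profile(scatterers, n_bins, width):
    """Sum one shifted, scaled sinc per scatterer, sampled at each range bin.

    scatterers: list of (range bin position, amplitude) impulses.
    width: hypothetical pulse scale in bins (the first nulls fall at +/- width).
    """
    return [sum(amp * sinc(math.pi * (b - pos) / width) for pos, amp in scatterers)
            for b in range(n_bins)]

# Two hypothetical scatterers: a strong one at bin 10 and a weaker one at bin 25.
scatterers = [(10, 1.0), (25, 0.5)]
profile = range_profile(scatterers, n_bins=40, width=2.0)

strongest_bin = profile.index(max(profile))
print(strongest_bin, round(profile[10], 3), round(profile[25], 3))
```

Each scatterer appears as a peak at its own range bin, slightly perturbed by the sidelobes of the other.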

HRR  radar  is  subject  to  two  primary  error  sources.  First,  the  returns  from  scatterers 
in  any  given  range  bin  have  arbitrary  phase,  meaning  that  they  will  reinforce  or  cancel 
stochastically.  Complete  reinforcement  in  all  range  bins  theoretically  defines  the  maximum 
possible  signal,  or  the  “envelope”  of  the  returned  waveform.  Total  or  near  total  cancellation 
in  any  given  bin,  on  the  other  hand,  produces  a  sharp  downward  spike  called  a  “null.”  The 
stochastic  nature  of  this  inter-scatterer  phase  problem  is  exacerbated  for  real  objects,  which 
vibrate  as  they  move,  causing  scatterers  to  move  relative  to  one  another. 
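The reinforcement and cancellation of returns within one range bin can be shown by summing the scatterer returns as complex vectors. The two equal-amplitude scatterers and their phases are hypothetical values chosen to show the two extremes.

```python
import cmath
import math

def bin_return(amplitudes, phases):
    """Magnitude of the complex-vector sum of scatterer returns in one range bin."""
    return abs(sum(a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)))

amps = [1.0, 1.0]  # two hypothetical scatterers of equal strength

envelope = bin_return(amps, [0.0, 0.0])    # in phase: complete reinforcement
null = bin_return(amps, [0.0, math.pi])    # opposite phase: near total cancellation
partial = bin_return(amps, [0.0, math.pi / 2.0])  # an intermediate case

print(envelope, null, partial)
```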

Second,  signal  paths  that  include  more  than  one  scatterer  -  i.e.,  paths  which  reflect 
from  one  scatterer  to  another  (or  more)  and  then  back  to  the  sensor  -  create  returns 
that  appear  “later”  or  at  longer  range  than  we  would  expect  from  studies  that  do  not 
consider  multiple-scatterer  returns.  Since  multiple-scatterer  interaction  models  provide 
a  combinatorial  expansion  in  simulation  complexity,  many  models  do  not  address  this 
problem [166, 64]. Cavities on the object are a particularly troublesome source of such
noise.

It should also be noted that HRR radar does not provide “in-plane rotation-invariant”
(see “PSRI” in App. A) signatures or features - that is, the HRR signature is a function
of the “roll” angle of the target relative to the sensor around the sensor-to-target vector.
This fact is due to the effects of waveform polarization at different target rotation angles.
For small errors in sensor-target roll angle knowledge, effects on target recognition due
to  incorrect  polarization  assumptions  are  not  considered  separately  from  the  other  (more 
significant)  error  sources  noted  above. 

HRR radar techniques are related to Inverse Synthetic Aperture Radar (ISAR) imaging
techniques in the sense that ISAR uses the amplitude and phase of the return in each
range bin, whereas classical HRR radar ignores phase information. Also, whereas HRR
radar techniques yield an identification directly, classical ISAR yields only a crude image,
which then is passed to an image-matching algorithm for actual identification [221, 52].

A  contract  between  the  USAF’s  Wright  Laboratory  and  General  Dynamics  (GD)  was 
one  of  the  first  detailed  efforts  in  HRR  radar,  and  GD’s  results  as  expressed  in  [20,  94, 
78,  213]  form  the  basis  for  the  following  part  of  this  discussion.  Related  efforts  are  now 
underway  with  Hughes  and  Westinghouse  under  separate  contracts  [166]. 

The  GD  HRR  recognition  approach  starts  with  the  gathering  of  radar  range  sweeps 
for  actual  targets  at  discrete  intervals  (“granularity”)  of  aspect  angle  and  polarization. 
Ideally,  we  have  multiple  measurements  for  each  aspect  angle  and  polarization  over  the 
entire 4π steradian extent of target aspect angle (i.e., the hypothetical target aspect angle
sphere,  as  in  Fig.  1.2).  Each  sweep  is  then  processed  to  extract  its  significant  peaks, 
recording  them  in  terms  of  range  (along  the  sensor-target  vector)  and  amplitude.  One 
set  of  such  “reduced”  measurements  (one  “reduced”  range  sweep  per  discrete  aspect  angle 
value)  is  chosen  to  represent  the  target  library  set  (“set  one”).  The  remaining  reduced 
measurements,  or  members  of  “set  two”,  are  used  in  training  the  algorithm. 
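The reduction of a range sweep to its significant peaks might be sketched as follows. The local-maximum test and the amplitude threshold are assumptions made for illustration; the actual peak selection criteria used by GD are not described here, and the sweep values are hypothetical.

```python
def extract_peaks(sweep, threshold):
    """Reduce a range sweep to (range bin, amplitude) pairs for its significant peaks.

    A bin is kept if it is a local maximum and its amplitude meets the threshold.
    """
    peaks = []
    for i in range(1, len(sweep) - 1):
        is_local_max = sweep[i] > sweep[i - 1] and sweep[i] >= sweep[i + 1]
        if is_local_max and sweep[i] >= threshold:
            peaks.append((i, sweep[i]))
    return peaks

# Hypothetical range sweep: amplitude per range bin.
sweep = [0.1, 0.4, 1.2, 0.5, 0.3, 0.9, 0.2, 0.1]
reduced = extract_peaks(sweep, threshold=0.5)
print(reduced)
```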

The  “slide  distance  metric”  [20,  94,  78]  is  used  (1)  in  actual  application  or  testing 
to  define  distances  (i.e.,  measures  of  difference)  between  elements  in  set  one  (the  library) 
and  the  measurements  from  an  unclassified  target,  and  (2)  in  training  to  define  distances 
between elements in sets one and two. During the training process, comparisons of elements
in set one and set two are used to establish statistics whereby, in actual application, the
slide  distance  results  from  a  given  measured  signature  can  be  used  to  define  probabilities 
of  class  membership  for  the  unknown  target. 

In  application,  the  metric  is  repeatedly  applied  while  “sliding”  an  unknown  target’s 
reduced  sweep  in  range  relative  to  a  library  reduced  sweep  to  align  the  two  range  sweeps  in 
each  of  four  different  ways  -  aligning  the  first  peak  of  the  unknown  sweep  to  the  first  peak  of 
the  library  sweep,  aligning  the  second  peak  of  the  unknown  sweep  to  the  second  peak  of  the 
library  sweep,  and  so  on  through  a  “fourth  peak-to-fourth  peak”  match.  An  acceptable 
peak  match  is  defined  as  one  for  which  the  amplitude  difference  for  the  matched  peaks 
is  less  than  or  equal  to  some  defined  value.  Following  the  initial  match,  the  remaining 
peaks  in  the  two  sweeps  are  paired,  again  subject  to  the  amplitude  difference  criterion, 
and  (since  remaining  peaks  will  not  in  general  be  aligned  horizontally,  i.e.,  in  range)  a 
horizontal  difference  criterion. 

For a comparison between any two reduced sweeps, this process results in four candidate
alignments, each of which involves matched peaks, unmatched peaks in the unknown
sweep,  and  unmatched  peaks  in  the  library  sweep.  The  slide  distance  d  of  each  candidate 
alignment  is  computed  using  a  formula  given  in  [94]: 

\[ d = \frac{(T_1 + T_2 + T_3)^{1/2}}{N_{mpp}} \]

where:

$T_1 = \sum [\,\ldots\,]$ (summation is over all matched peak pairs)

$T_2 = F_{pen} \times [10 \log(R_u)]^2$

$T_3 = F_{pen} \times [10 \log(R_l)]^2$

$F_{pen}$ = a constant penalty factor

$W_{b-a}$ = bin-to-amplitude weight (a constant)

$\delta_{bin}$ = the difference in range bin locations for two peaks that have been paired

$\delta_{amp}$ = the difference in amplitudes for two peaks that have been paired

$N_{mpp}$ = the number of matched peak pairs

$R_u$ = ratio of power of matched peaks in unknown sweep to power of all peaks in that sweep

$R_l$ = ratio of power of matched peaks in library sweep to power of all peaks in that sweep

and the “power” of a set of peaks, as discussed in the definitions for $R_u$ and $R_l$, is defined
as the sum of the peak amplitudes in square meters.

Finally,  the  smallest  slide  distance  of  the  four  candidate  alignments  is  chosen  as  the 
final  slide  distance  between  the  two  reduced  sweeps.  Finer,  bin-by-bin  realignment  and 
slide  distance  reduction  is  possible  beyond  this  point,  and  is  discussed  in  [20],  but  is  not 
implemented  in  the  code  provided  to  this  author.  One  may  observe  that  this  process 
achieves  a  kind  of  correlation  between  the  two  sweeps. 
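
As an illustration of the power-ratio penalty terms $T_2$ and $T_3$ defined above, the following minimal Python sketch computes $F_{pen} \times [10 \log(R)]^2$ for one sweep. The function name, the argument layout, and the reading of "log" as a base-10 logarithm (suggested by the decibel-style factor of 10) are assumptions made here for illustration; they are not taken from [94] or from the General Dynamics code.

```python
import math

def power_ratio_penalty(matched_amps, all_amps, f_pen):
    """Return F_pen * [10 log10(R)]^2, where R is matched power over total power.

    "Power" is the sum of peak amplitudes (square meters), as defined in the text.
    Assumes at least one matched peak, so that R > 0.
    """
    ratio = sum(matched_amps) / sum(all_amps)
    return f_pen * (10.0 * math.log10(ratio)) ** 2

# Hypothetical sweep: peaks of 1, 1, and 2 square meters, the first two matched.
penalty = power_ratio_penalty([1.0, 1.0], [1.0, 1.0, 2.0], f_pen=1.0)
```

The penalty is zero when every peak in the sweep is matched ($R = 1$) and grows as more of the sweep's power is left unmatched.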

Another  HRR  radar  signature  metric  is  the  MFML,  or  Minimum  Feature,  Maximum 
Likelihood  metric  [20]  developed  by  Cyberdynamics  under  contract  to  General  Dynamics. 
This  metric  has  given  good  results  in  recent  research  by  Cyberdynamics,  subsequent  to 
the  original  General  Dynamics  work  [18].  Efforts  to  date  by  Westinghouse  under  contract 
to  WL/AARA  [166]  have  employed  neural  nets  to  analyze  individual  signatures  (i.e.,  the 
classification  “metric”  is  defined  by  the  net  during  training). 

It  is  significant  for  our  purpose  to  note  that  most  of  the  referenced  sources  [94,  78,  213] 
do  not  discuss  target  kinematics  in  any  context,  particularly  with  regard  to  determining 
the fundamental feasibility of the maneuvers which are implied by the sequences of aspect
angles selected during the association process discussed in the preceding paragraphs.
However,  the  final  report  for  the  General  Dynamics  effort  [20]  gives  equations  for  “conical 
uncertainty,”  used  to  size  search  windows,  and  the  report  notes  on  page  4-2,  with  regard 
to  the  MFML  metric,  that  “progression”  (in  aspect  angle)  of  aspect  angle  (pose)  estimates 
derived from the recognition process could be used to reduce the probability of misidentification,
since the correct target model will show “smooth progression.” The report notes
that  “this  capability  has  not  yet  been  implemented.” 

Finally, an additional metric is provided from references by Fukunaga [92], Mieras
[164], Mitchell [166], and Stewart [207]. This metric is a Mahalanobis metric, as defined
in [212], i.e., a quadratically-weighted squared distance of the form $[z_u - \bar{z}_\omega]^T S_\omega^{-1} [z_u - \bar{z}_\omega]$,
where $z_u$ is a vector of range bin returns for an unknown target, $\bar{z}_\omega$ is a vector of mean range
bin returns for a known aspect angle on a known target of class $\omega$, and $S_\omega^{-1}$ is the inverse
covariance matrix corresponding to the mean vector $\bar{z}_\omega$. For a process $\omega$ with the Gaussian
classical likelihood $p(z \mid \omega)$ of mean $\bar{z}_\omega$ and covariance $S_\omega$, the Mahalanobis distance $M$ is
the key argument in the “log likelihood” of observing an event $z_u$, or $p(z_u \mid \omega)$. That is,
$\ln[p(z_u \mid \omega)] = C - (0.5 \times M)$, where $C$ is a function of $S_\omega$ which can often be taken as
a  constant.  The  straightforward  multivariate  Gaussian  origin  of  the  Mahalanobis  metric 
makes  it  readily  applicable  in  probabilistic  approaches,  and  this  metric  is  the  primary 
signature  metric  investigated  in  this  research. 
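
The relation between the Mahalanobis distance and the Gaussian log likelihood can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a diagonal covariance (independent range bins) so that no matrix inverse is needed, and the function names and sample values are hypothetical rather than drawn from any of the cited sources.

```python
import math

def mahalanobis_diag(z_u, z_mean, variances):
    """Squared Mahalanobis distance M for a diagonal covariance matrix."""
    return sum((a - b) ** 2 / v for a, b, v in zip(z_u, z_mean, variances))

def log_likelihood_diag(z_u, z_mean, variances):
    """ln p(z_u | omega) = C - 0.5 * M for a Gaussian with diagonal covariance."""
    n = len(z_u)
    # C depends only on the covariance: -0.5 * (n ln(2 pi) + ln det S)
    c = -0.5 * (n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances))
    return c - 0.5 * mahalanobis_diag(z_u, z_mean, variances)

# Hypothetical two-bin signature compared against a library mean.
m = mahalanobis_diag([1.0, 2.0], [0.0, 0.0], [1.0, 4.0])
ll = log_likelihood_diag([1.0, 2.0], [0.0, 0.0], [1.0, 4.0])
```

When every candidate class shares the same covariance, $C$ is common to all classes and ranking classes by log likelihood is the same as ranking them by smallest $M$.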

Despite  their  common  use  of  this  metric,  the  users  noted  in  the  previous  paragraph 
have  applied  it  in  significantly  different  fashions.  Fukunaga  uses  a  high  range  resolution 
radar  database  gathered  for  two  automobile  classes  (a  sedan  and  a  van)  at  various  aspect 
angles  to  illustrate  concepts  in  statistical  pattern  recognition.  This  work  makes  no  use  of 
motion  information  for  a  priori  limiting  (conditioning)  of  aspect  angle.  Unlike  the  other 
HRR  radar  efforts  discussed  prior  to  this  point,  the  method  of  Mieras  (Raytheon)  [164, 165] 
for  HRR  radar  aircraft  target  recognition  is  syntactic  in  nature,  and  intimately  related  to 
efforts  in  this  dissertation  (to  be  discussed  further  in  Sect.  3.8.1).  Maximum  likelihood 
methods  are  proposed  by  Mieras  to  work  around  uncertainties  in  multiplicative  scale  factor 
and  range  registration  (i.e.,  bin  alignment  on  the  unknown  target).  Finally,  Stewart  [207] 
reminds  us  that,  where  the  covariance  matrix  is  assumed  to  be  diagonal  with  equal  variance 
in  each  bin,  the  range  registration  process  is  a  correlation  in  the  range  domain,  and  is  thus 
more  easily  implemented  in  the  frequency  domain  as  a  complex  conjugate  multiplication. 
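
Stewart's observation can be illustrated with a short Python sketch. With equal variance in every bin, minimizing the squared distance over range shifts is equivalent to maximizing the correlation between the two sweeps, and that correlation can be computed by multiplying one transform by the complex conjugate of the other. The sketch assumes circular (wrap-around) shifts and uses a direct discrete Fourier transform for self-containment; the function names and sample sweeps are hypothetical, and a practical implementation would use a fast Fourier transform.

```python
import cmath

def dft(x, inverse=False):
    """Direct discrete Fourier transform (O(n^2)); the inverse includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_xcorr(unknown, library):
    """c[s] = sum_t unknown[(t + s) % n] * library[t], via conjugate multiplication."""
    U = dft(unknown)
    L = dft(library)
    prod = [u * l.conjugate() for u, l in zip(U, L)]
    return [v.real for v in dft(prod, inverse=True)]

def best_shift(unknown, library):
    """Range-bin shift of the unknown sweep that maximizes correlation with the library sweep."""
    c = circular_xcorr(unknown, library)
    return max(range(len(c)), key=lambda s: c[s])

# Hypothetical sweeps: the unknown is the library sweep displaced by two range bins.
library = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
unknown = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
shift = best_shift(unknown, library)
```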

A  fundamentally  syntactic  method  for  HRR  radar  target  recognition  using  neural 
nets  was  proposed  by  Farhat  [81].  He  defined  neural  network  classifiers  using  inputs  of 
“sinograms,” or sequences of high range resolution radar sweeps, as generated by constant
angular rate changes of target aspect angle with respect to the sensor (i.e., as with the
motion of a target model on a turntable). Farhat’s published work does not discuss,
however, how he would deal with all of the possible “sinograms” that could be generated
by  all  possible  targets  moving  over  all  possible  (not  generally  constant  rate)  trajectories  - 
nor  does  he  address  the  use  of  kinematic  information  available  from  the  radar  sensor. 
Presumably,  a  properly  structured  and  trained  neural  net  would  generalize  to  some  degree, 
but  Farhat’s  approach  does  not  clearly  offer  strong  promise  of  being  able  to  assign  an 
observed  sequence  of  range  sweeps  unambiguously  to  the  correct  target  class. 

Air  Force  Institute  of  Technology  students  DeWitt  [68]  and  Kouba  [128]  have  recently 
investigated  the  use  of  syntactic  techniques  (hidden  Markov  models  and  recurrent  neural 
nets,  respectively)  for  HRR  radar  target  recognition.  It  is  believed  that  their  methods  could 
be  used  in  combination  with  concepts  discussed  in  this  research  to  design  fully  functional 
automatic  object  recognition  systems. 

2.2.4 Pattern Recognition - Conclusion. This completes our overview of pattern
recognition and conventional approaches to automatic object recognition. Following
sections will consider theory and techniques that we will ultimately fold back into object
recognition algorithms, in attempts to make better use of information that is often
available,  but  ignored  in  decision  making. 

2.3 Object Tracking / State Estimation

The  science  of  object  tracking  and  state  estimation  grew  out  of  efforts  to  use  discrete 
radar  observations  to  develop  continuous  object  tracks,  or  estimates  of  object  kinematic 
state.  The  research  described  in  this  dissertation  in  a  sense  reverses  those  efforts,  using 
the  kinematic  state  estimates  to  recognize  the  object  class  that  generated  the  observations. 
First,  it  is  essential  to  understand  what  information  is  available  from  object  tracking. 

The most significant milestone in object tracking was the development
of  the  Kalman  filter  [113,  119,  153,  198]  for  optimal  state  estimation  of  systems  that  can  be 
adequately  described  by  linear  dynamics  models  with  forcing  functions  of  (heuristically) 
white  Gaussian  noise,  given  measurements  corrupted  by  the  same  form  of  noise.  The  linear 
system  structure  of  the  Kalman  filter  made  it  immediately  applicable  to  multisensor  fusion 
of kinematic information, and special forms - the linearized and extended Kalman filters -
were developed to handle nonlinear dynamics and/or measurement equations [154]. With
particular  choices  of  models  and  noise  assumptions  for  given  objects,  the  extended  Kalman 
filter  forms  the  core  of  nearly  every  object  tracking  software  system  in  use  today. 

2.3.1  The  Kalman  Filter  and  Other  Estimators.  The  purpose  of  this  section 
is  to  delineate  the  equations  for  the  conventional  Kalman  filter  and  the  modifications  to 
those  equations  necessary  in  the  case  of  the  extended  Kalman  filter.  The  form  of  the 
equations  will  be  that  for  a  continuous  dynamic  system  for  which  discrete  (sampled  data) 
measurements  are  available  (the  most  common  case  in  general  and  for  tracking  systems 
in  particular).  The  last  portions  of  this  section  discuss  optimal  smoothing  of  Kalman 
filter  estimates,  parameter  estimation  with  multiple  model  Kalman  filters,  and  nonlinear 
estimators.  This  information  is  taken  from  texts  by  Maybeck  [153,  154],  in  some  cases  as 
previously  summarized  by  the  author  [143]. 

2.3.1.1 The Kalman Filter. The Kalman filter derives the optimal estimate
for a linear dynamic system described by:

\[ \dot{x}(t) = F(t)x(t) + B(t)u(t) + G(t)w(t) \tag{2.5} \]

and which has discrete linear measurements corrupted by white Gaussian noise:

\[ z(t_i) = H(t_i)x(t_i) + v(t_i) \tag{2.6} \]

where: 

x  =  an  n-dimensional  system  state  vector 

F  =  an  n  x  n-dimensional  plant  dynamics  matrix 

B  =  an  n  x  r-dimensional  deterministic  input  matrix 

u  =  an  r-dimensional  vector  of  deterministic  inputs 

G  =  an  n  x  s-dimensional  dynamics  driving  noise  distribution  matrix 

w  =  an  s-dimensional  vector  of  white  Gaussian  dynamics  driving  noises 


z = an m-dimensional measurement vector

H = an m x n-dimensional measurement matrix

v = an m-dimensional white Gaussian noise vector

t = time (function argument)

$t_i$ = ith sample time (function argument)

The white Gaussian noise vectors $w(t)$ and $v(t_i)$ are assumed to be zero-mean,
independent, and to have covariances defined by:

\[ E[w(t)w^T(t + \tau)] = Q(t)\delta(\tau) \tag{2.7} \]

\[ E[v(t_i)v^T(t_j)] = R(t_i)\delta_{ij} \tag{2.8} \]


where: 

E  =  the  expectation  operator 

$Q(t)$ = an s x s-dimensional positive semidefinite matrix describing the strength of
the dynamic driving noise vector, $w(t)$, at time $t$

$R(t_i)$ = an m x m-dimensional symmetric positive definite matrix - the covariance
of the measurement error at time $t_i$. See [153:216] for further discussion on the assumption
of positive definiteness for $R(t_i)$

$\delta(\tau)$ = the Dirac delta function, equal to infinity for $\tau = 0$ and zero elsewhere, with
area of unity when integrated over all $\tau$ (units of time$^{-1}$)

$\delta_{ij}$ = the Kronecker delta, equal to one for $i = j$, zero otherwise (dimensionless)

For  these  systems,  the  optimal  estimate  is  computed  after  each  measurement  as: 

\[ \hat{x}(t_i^+) = \hat{x}(t_i^-) + K(t_i)[z_i - H(t_i)\hat{x}(t_i^-)] \tag{2.9} \]

and the estimated covariance of the (zero-mean) error for this estimate is:

\[ P(t_i^+) = P(t_i^-) - K(t_i)H(t_i)P(t_i^-) \tag{2.10} \]


where: 

$\hat{x}(t_i^+)$ = estimated value of system state variables of interest after measurement
update at time $t_i$

$\hat{x}(t_i^-)$ = estimated value of system state variables before update at time $t_i$

$K(t_i)$ = Kalman filter gain at time $t_i$

$z_i$ = observed measurement values at time $t_i$

$H(t_i)$ = measurement matrix at time $t_i$ (post-multiplied by $\hat{x}(t_i^-)$ to give the predicted
measurement value for time $t_i$)

$[z_i - H(t_i)\hat{x}(t_i^-)]$ = the filter residual, or difference between observed and predicted
measurements at time $t_i$

$P(t_i^+)$ = estimated error covariance after measurement update at time $t_i$

$P(t_i^-)$ = estimated error covariance before measurement update at time $t_i$

and $K(t_i)$ is found using the relation:

\[ K(t_i) = P(t_i^-)H^T(t_i)[H(t_i)P(t_i^-)H^T(t_i) + R(t_i)]^{-1} \tag{2.11} \]

for  which  all  elements  are  defined  above. 
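
The measurement update of Eqns. (2.9) through (2.11) can be sketched in Python for a small case. This is a minimal illustration, assuming a hypothetical two-state tracker (position and velocity) with a scalar position measurement, so that $H = [1 \;\; 0]$ and the bracketed inverse in Eqn. (2.11) reduces to a scalar division. The function name and the numerical values are illustrative assumptions, not part of the referenced texts.

```python
def kalman_update(x_minus, P_minus, z, R):
    """Measurement update for state [position, velocity] with H = [1, 0]."""
    # Residual: observed minus predicted measurement, z - H x^-
    residual = z - x_minus[0]
    # Residual variance H P^- H^T + R (a scalar, since the measurement is scalar)
    A = P_minus[0][0] + R
    # Gain K = P^- H^T [H P^- H^T + R]^-1, Eqn. (2.11)
    K = [P_minus[0][0] / A, P_minus[1][0] / A]
    # State update, Eqn. (2.9)
    x_plus = [x_minus[0] + K[0] * residual, x_minus[1] + K[1] * residual]
    # Covariance update P^+ = P^- - K H P^-, Eqn. (2.10); H P^- is row 0 of P^-
    P_plus = [[P_minus[i][j] - K[i] * P_minus[0][j] for j in range(2)] for i in range(2)]
    return x_plus, P_plus, residual, A

# Hypothetical prior estimate and a single position measurement.
x_plus, P_plus, residual, A = kalman_update(
    [0.0, 1.0], [[4.0, 2.0], [2.0, 3.0]], z=2.0, R=4.0)
```

The function also returns the residual and its variance, since these are the quantities that residual analysis operates on.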

The elements of $\hat{x}$ and $P$ at time $t_i^-$ are obtained by propagating the estimates of
those quantities forward in time from the previous set of values according to the state
dynamics:

\[ \dot{\hat{x}}(t \mid t_{i-1}) = F(t)\hat{x}(t \mid t_{i-1}) + B(t)u(t) \tag{2.12} \]

and

\[ \dot{P}(t \mid t_{i-1}) = F(t)P(t \mid t_{i-1}) + P(t \mid t_{i-1})F^T(t) + G(t)Q(t)G^T(t) \tag{2.13} \]

where $\hat{x}(t \mid t_{i-1})$ = the estimated values of system state variables at time $t$ based on
measurements through time $t_{i-1}$, and the meanings of other variables may be inferred
from earlier discussion. The initial state value is modelled as a random vector normally
distributed with mean $\hat{x}_0$ and covariance $P_0$. Often in the case of radar tracking, $\hat{x}_0$ is
derived from the initial measurements and $P_0$ is a function of the error statistics of those
measurements  [10:80-82]  [35:153-154]  [143]. 
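
The propagation of Eqns. (2.12) and (2.13) between measurements can be sketched numerically. The Python fragment below is an illustration under stated assumptions: it uses simple Euler integration, constant $F$, $G$, and $Q$, no deterministic input ($u = 0$), and a hypothetical constant-velocity model. None of these choices comes from the referenced texts, and a production tracker would normally use a state transition matrix or a higher-order integrator.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def propagate(x, P, F, GQGt, dt, steps=1000):
    """Euler integration of Eqns. (2.12) and (2.13) with u(t) = 0 and constant F, G, Q."""
    n = len(x)
    h = dt / steps
    Ft = transpose(F)
    for _ in range(steps):
        x_dot = [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]
        FP = matmul(F, P)
        PFt = matmul(P, Ft)
        P_dot = [[FP[i][j] + PFt[i][j] + GQGt[i][j] for j in range(n)] for i in range(n)]
        x = [x[i] + h * x_dot[i] for i in range(n)]
        P = [[P[i][j] + h * P_dot[i][j] for j in range(n)] for i in range(n)]
    return x, P

# Hypothetical constant-velocity model: state = [position, velocity].
F = [[0.0, 1.0], [0.0, 0.0]]
GQGt = [[0.0, 0.0], [0.0, 0.1]]   # driving noise enters the velocity state only
x_pred, P_pred = propagate([0.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], F, GQGt, dt=1.0)
```

Position uncertainty grows during propagation even though noise drives only the velocity state, because velocity error integrates into position error.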

Note  in  particular  that  the  accuracy  of  a  Kalman  filter  state  estimate  so  generated 
is  completely  dependent  on  the  extent  to  which  the  filter  dynamics,  measurement,  and 
noise  statistical  models  reflect  the  true  processes.  In  most  cases,  the  true  processes  are 
either  not  fully  understood  or  cannot  be  modelled  exactly,  or  to  do  so  would  result  in 
unacceptable filter computational requirements. In either case, the performance of a purposefully
reduced-order and simplified filter can only be proven by actual use, but ideally
it  should  be  tested  prior  to  implementation  by  simulating  operation  with  a  high  fidelity 
truth  model  [153:289-291]. 

Consider,  on  the  other  hand,  a  process  for  which  dynamics  and/or  measurements 
cannot  be  described  by  the  linear  Eqns.  (2.5)  and  (2.6)  above.  An  “extended”  Kalman 
filter  routine  can  be  written  for  a  system  in  which  it  is  possible  to  express  dynamics  by  an 
equation  of  the  following  more  general  form: 

\[ \dot{x}(t) = f[x(t), u(t), t] + G(t)w(t) \tag{2.14} \]


where: 

x  =  an  n-dimensional  system  state  vector 
f  =  an  n-dimensional  vector  dynamics  function 
u  =  an  r-dimensional  vector  of  deterministic  inputs 
t  =  time 


G = an n x s-dimensional dynamics driving noise distribution matrix
w = an s-dimensional vector of white Gaussian dynamics driving noises

and where measurements can be expressed as:

\[ z(t_i) = h[x(t_i), t_i] + v(t_i) \tag{2.15} \]

where h = an m-dimensional vector measurement function, and other values are as defined
earlier. Note that for this development, the zero-mean white Gaussian noises $w(t)$ and $v(t_i)$
are, as before, required to be independent of each other and to enter the system linearly. As
is the case with the conventional (linear) Kalman filter, the extended Kalman filter for the
case of continuous dynamics and discrete updates is implemented as a series of alternating
updates and propagations of the state estimate and filter-computed error covariance.

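The updated state vector and covariance matrix are propagated forward in time to
the next update using the relations:

\[ \dot{\hat{x}}(t \mid t_{i-1}) = f[\hat{x}(t \mid t_{i-1}), u(t), t] \tag{2.16} \]

and

\[ \dot{P}(t \mid t_{i-1}) = F[\hat{x}(t \mid t_{i-1}), t]\,P(t \mid t_{i-1}) + P(t \mid t_{i-1})\,F^T[\hat{x}(t \mid t_{i-1}), t] + G(t)Q(t)G^T(t) \tag{2.17} \]

where all quantities have been defined earlier except:

$F[\hat{x}(t \mid t_{i-1}), t]$ = an n x n matrix of partial derivatives, $\left. \partial f[x, u(t), t] / \partial x \right|_{x = \hat{x}(t \mid t_{i-1})}$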

Updates  are  performed  using  the  following  equations: 

\[ \hat{x}(t_i^+) = \hat{x}(t_i^-) + K(t_i)\{z_i - h[\hat{x}(t_i^-), t_i]\} \tag{2.18} \]

and

\[ P(t_i^+) = P(t_i^-) - K(t_i)H[\hat{x}(t_i^-), t_i]P(t_i^-) \tag{2.19} \]

where:

$h[\hat{x}(t_i^-), t_i]$ = the predicted measurement value at time $t_i$

$\{z_i - h[\hat{x}(t_i^-), t_i]\}$ = the filter residual, or difference between observed and predicted
measurements at time $t_i$, in the form appropriate for the extended Kalman filter

$H[\hat{x}(t_i^-), t_i]$ = an m x n matrix of partial derivatives, $\left. \partial h[x, t_i] / \partial x \right|_{x = \hat{x}(t_i^-)}$

and  K  is  computed  as  given  earlier  using  the  value  of  H  computed  above.  Other  values 
are  as  defined  earlier. 
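
The extended Kalman filter update of Eqns. (2.18) and (2.19) can be illustrated with a nonlinear range measurement. The Python sketch below is an assumption-laden example, not a method from the referenced texts: it takes a hypothetical two-dimensional position state $[p_x, p_y]$, a scalar measurement $h[x] = (p_x^2 + p_y^2)^{1/2}$, and evaluates the partial derivative matrix $H$ at the pre-update estimate.

```python
import math

def ekf_range_update(x_minus, P_minus, z, R):
    """EKF update for state [px, py] with a scalar range measurement."""
    px, py = x_minus
    r_pred = math.hypot(px, py)            # predicted measurement h[x^-]
    H = [px / r_pred, py / r_pred]         # dh/dx evaluated at x^-
    # P^- H^T (a column, stored as a list)
    PHt = [P_minus[0][0] * H[0] + P_minus[0][1] * H[1],
           P_minus[1][0] * H[0] + P_minus[1][1] * H[1]]
    A = H[0] * PHt[0] + H[1] * PHt[1] + R  # scalar residual variance
    K = [PHt[0] / A, PHt[1] / A]           # gain, as in Eqn. (2.11) with the new H
    residual = z - r_pred                  # residual in the form of Eqn. (2.18)
    x_plus = [px + K[0] * residual, py + K[1] * residual]
    # H P^- (a row), used in the covariance update of Eqn. (2.19)
    HP = [H[0] * P_minus[0][0] + H[1] * P_minus[1][0],
          H[0] * P_minus[0][1] + H[1] * P_minus[1][1]]
    P_plus = [[P_minus[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_plus, P_plus, residual

# Hypothetical prior at (3, 4) with unit covariance and a measured range of 6.
x_plus, P_plus, residual = ekf_range_update(
    [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]], z=6.0, R=1.0)
```

Because $H$ is re-evaluated at each new estimate, the gain and covariance depend on the measurement history, which is why they cannot be precomputed as in the linear case.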

The  matrix  partials  F  and  H  represent  the  successive  relinearization  about  newly 
declared  nominal  values,  i.e.,  the  latest  best  estimates  of  system  state.  Note  therefore  that, 
unlike  conventional  (and  linearized)  Kalman  filters,  the  extended  Kalman  filter  relations  for 
updating  and  computing  the  filter-computed  error  covariance  are  functions  of  the  estimate 
values.  This  dependence  is  unfortunate  in  that  it  forbids  calculation  of  the  actual  filter 
gains and covariances a priori, an on-line computation-saving process which can be done
for  the  linear  filter.  Pre-computed  and  constant-gain  extended  Kalman  filters  have  been 
constructed  using  gain  calculations  from  reasonably  well-known  nominal  state  trajectories, 
however  [154:57]. 

2.3.1.2 Forward-Backward Kinematic State Estimation: Smoothing. In
most  tracking  applications,  we  can  accept  a  current  estimate  of  the  system  (object)  state, 
based  or  conditioned  on  the  prior  and  current  measurement  history.  In  some  applications, 
however,  we  need  to  define  the  best  estimate  of  system  state  at  each  point  in  time  along 
some  portion  of  a  trajectory,  with  the  estimate  at  any  time  conditioned  not  only  on  prior 
and  current  measurements,  but  also  on  measurements  taken  after  that  time.  The  process 
of  obtaining  this  estimate  is  called  smoothing  [154,  159,  160].  A  typical  requirement  for 
smoothing arises in post-test analysis for antiaircraft missile test intercepts (without warhead
detonation), where a best estimate of miss distance is required for lethality assessment,
based  on  the  complete  measurement  history  of  the  engagement,  i.e.,  for  measurements  both 
before  and  after  the  missile  passes  by  the  target  aircraft. 

An  analogous  motivation  will  exist  for  the  research  discussed  here  -  in  part,  this 
research  requires  an  estimate  of  aspect  angle  based  solely  on  aircraft  kinematics  (more 
precisely,  measurements  of  translation  states).  Under  normal  circumstances,  as  will  be 
discussed  in  Chapter  V,  aircraft  orientation  can  be  reasonably  well  estimated  as  a  function 
of aircraft velocity and acceleration - acceleration commands (implemented directly by
change  of  pitch,  roll,  yaw,  thrust,  or,  if  the  design  allows,  drag)  being  an  unknown  input 
by  the  pilot.  Our  kinematic  measurements,  however,  are  generally  of  position  only  (perhaps 
position  and  doppler-derived  velocity),  and  estimating  velocity  and  acceleration  inherently 
require  us  (at  least  conceptually,  if  not  explicitly  in  the  algorithm  itself)  to  differentiate 
our  position  information  once  and  twice,  respectively.  This  implicit  twofold  differentiation 
and  unpredictable  pilot  inputs  mean  that  our  ability  to  estimate  acceleration  at  some  time 
$t_i$ from prior and current position measurements is quite limited. If we are willing to
wait, however, measurements of position for times beyond time $t_i$ will reveal much about
the acceleration and orientation at time $t_i$. Thus, we may choose to make a series of
radar measurements for an aircraft undergoing some maneuver, and then estimate the
most likely aspect angle for each point in time during that maneuver, conditioned both on
measurements  before  and  shortly  after  that  time  point.  For  this  purpose,  we  will  require 
a  smoother. 

There are three main forms of smoothers, based on measurement information available
and state estimate desired [154:1-18]: fixed interval, fixed point, and fixed lag. For
the fixed interval smoother, we assume the availability of a set of measurements over some
time interval, and determine the best estimate of the state for each point in time during
that interval, conditioned on all of the measurements taken during that interval. For a
fixed  point  smoother,  we  seek  an  estimate  of  the  state  at  one  time  tp  only,  conditioned 
on  measurements  before  and  after  time  tp  up  until  the  most  recent  time,  continuing  to 
improve  the  estimate  with  measurements  until  some  final  time  tf.  Finally,  for  a  fixed  lag 
smoother,  we  find  an  estimate  for  the  state  at  each  point  along  the  trajectory,  conditioned 
on  measurements  prior  to  that  point  in  time  and  measurements  for  some  fixed  time  interval 
(lag)  beyond. 

For  the  purpose  considered  here,  the  last  form,  or  fixed  lag  smoother,  is  most  relevant, 
and  only  this  form  will  be  discussed  here.  We  will  generally  expect  to  maintain  track  on 
an  aircraft  target  for  a  short  time  before  selecting  a  time  period  over  which  to  perform 
a  target  recognition  process  (a  period  selected,  for  example,  because  the  target  executes 
a  revealing  maneuver  -  one  for  which  we  expect  significant  differences  between  signatures 
for potential target classes). Information collected prior to the period of interest should be
made  available  to  the  estimator.  The  reader  might  point  out  that  this  implementation  by 
definition  imposes  an  added  time  lag  beyond  that  required  for  a  fixed  interval  smoother, 
but  if  this  time  lag  is  critical,  we  have  options  for  producing  smoothed  estimates  of  the  state 
for  times  during  the  lag  period  as  well  (e.g.,  a  set  of  fixed  point  smoothers  for  constantly 
changing  points  in  time  during  the  lag). 

Equations  and  procedures  for  implementing  the  fixed  interval  smoother  are  shown  in 
App.  C  in  the  interest  of  saving  space  here.  These  equations  are  provided  in  linear  Kalman 
filter  format,  as  presented  in  [154:16-17],  but  no  significant  difficulties  are  encountered  in 
applying  them  under  extended  Kalman  filter  assumptions. 

2.3.1.3 Residual Analysis and Multiple Model Filtering. A well-known
application for Kalman filters [154:129-136] is the estimation of parameters for state dynamics
(“plant”), measurement, and/or noise models, simultaneously with state estimation
itself,  for  systems  from  which  measurements  are  available,  but  for  which  true  parameter 
values  are  unknown.  Note  that  the  distinction  between  states  and  parameters  made  here 
is  taken  from  Maybeck  [154:69],  essentially  that  parameters  and  states  taken  together  are 
variables  that  define  a  system  and  its  activity,  but  parameters,  if  they  vary  at  all,  vary 
slowly  enough  to  be  considered  essentially  constant  for  periods  of  interest  to  the  estimator. 
In  our  case,  for  example,  we  might  recognize  the  physical  structure  of  a  tactical  target 
as defining its parameters, while its six degree-of-freedom (“6 DOF” - three translational
and  three  angular  degrees  of  freedom)  motion  through  physical  space  over  time  defines  its 
state  history. 

When  a  discrete,  finite  number  of  likely  parameter  vectors  (values)  can  be  identified 
a priori, we can define a set of Kalman filters according to these model parameters, and use
the filters in a multiple model adaptive estimator (MMAE) structure to estimate the true
parameter vector for the unknown system. We will use the variable $\omega$ here to represent
sets or vectors of parameter values, recognizing that $\omega$ was used earlier in this chapter to
represent object classes for pattern recognition. This choice is completely intentional. One
of the basic themes of this research is that an “object” is, in an abstract sense, simply a
point or vector of parameters in some infinite dimensional parameter space, which we must
approximate  or  model  as  finite  dimensional  in  practice. 

Parameter  identification  using  sets  of  Kalman  filters,  or  multiple  model  estimation,  is 
accomplished by analyzing the residual sequences from the respective filters. Filter residuals
are the quantities $\{z_k - H(t_k)\hat{x}(t_k^-)\}$ for a linear Kalman filter, or $\{z_k - h[\hat{x}(t_k^-), t_k]\}$ for
an extended Kalman filter - that is, the differences between actual and expected measurements
at each time $t_k$. These residuals were respectively the quantities in square brackets in
Eqn.  (2.9)  and  braces  in  Eqn.  (2.18)  which  provide  new  information  to  recursively  update 
the  current  state  estimate. 

For a linear filter with the classical Gaussian noise assumptions and proper model
parameters $\omega_j$, the residual sequence over time can be shown to be zero mean, white
(independent in time), and Gaussian, with covariance $A_j(t_k) = H(t_k)P(t_k^-)H^T(t_k) + R(t_k)$.
Equivalently, we can say that we expect the measurements $z_k$ to have the probability density
function or likelihood $p(z_k \mid Z_{k-1}, \omega_j)$, which has a mean of $H(t_k)\hat{x}(t_k^-)$ and covariance
$H(t_k)P(t_k^-)H^T(t_k) + R(t_k)$. These properties tend to hold true (to first order) as well
for cases well-estimated by linearized and extended Kalman filters (in the latter case, the
mean is given by $h[\hat{x}(t_k^-)]$ rather than $H(t_k)\hat{x}(t_k^-)$).

It is extremely important to recognize that the likelihood $p(z_k \mid Z_{k-1}, \omega_j)$ is explicitly
a measure of the joint likelihood that the system (parameter set $\omega_j$) modelled by the Kalman
filter could simultaneously output the elements of the observed measurement vector $z_k$,
having previously output the measurement vector sequence $Z_{k-1}$. The utility of residual
analysis is that measurements from a system with parameters $\omega_j$, processed by a filter
designed for parameters $\omega_i \neq \omega_j$, will not in general exhibit the proper residual mean (i.e.,
zero) or covariance. This deviation betrays an improper model choice.

At  this  point,  we  need  to  observe  that  this  approach  is  intimately  related  to  the 
method  of  Therrien  [211]  [9:177-178]  noted  in  Sect.  2.2.2  above.  Therrien  observed  that 
a  wider  class  of  tools  than  simply  the  Kalman  filter  are  available  which  produce  residual 
sequences  having  the  properties  noted  above.  His  approach,  however,  was  oriented  toward 
linear  prediction  of  sensor  signatures  for  residual  analysis.  In  cases  of  interest  to  us,  as  we 
will see in the following chapter, linear prediction of sensor signatures is not generally feasible,
yet we desire to exploit the information contained in these signatures. In subsequent
chapters,  this  research  will  extend  residual  analysis  for  object  recognition  using  kinematics 
and  sensor  signatures,  where  linear  prediction  of  signatures  is  not  reasonable  or  feasible. 

Continuing  the  discussion  of  classical  multiple  model  parameter  estimation,  note  that 
given a priori knowledge of the likelihood of parameter sets $\omega_j$, Bayes’ Rule can be used
to find an a posteriori probability of the presence of each parameter set, conditioned on
the observed measurements and a priori knowledge:

\[ p(\omega_i \mid Z_k) = \frac{\left\{\prod_{n=1}^{k}[p(z_n \mid Z_{n-1}, \omega_i)]\right\} p(\omega_i)}{\sum_{j=1}^{J}\left\{\prod_{n=1}^{k}[p(z_n \mid Z_{n-1}, \omega_j)]\right\} p(\omega_j)} \tag{2.20} \]

for $i = 1, 2, \ldots, J$, where:

$Z_{n-1}$ = a set of $n - 1$ measurement vectors $z$

$z_n$ = a vector of measurements available at time $t_n$
and  other  variables  are  as  defined  earlier. 

An  iterative  version  of  Eqn.  (2.20)  which  converges  in  practice  [154:133]  to  an  a 
posteriori probability of one for the true parameter set $\omega$ is given by:

\[ p(\omega_i \mid Z_k) = \frac{p(z_k \mid Z_{k-1}, \omega_i)\, p(\omega_i \mid Z_{k-1})}{\sum_{j=1}^{J} p(z_k \mid Z_{k-1}, \omega_j)\, p(\omega_j \mid Z_{k-1})} \tag{2.21} \]

where:

$p(\omega_j \mid Z_0) = p(\omega_j)$ (the a priori probability of class $\omega_j$)

and other variables are as defined earlier.

Note that in the absence of specific information on $p(\omega_j)$ for each $j$, they may be
assumed  equal,  or  effectively  ignored.  This  development  is  entirely  analogous  to  that  of 
Eqn.  (2.2). 
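
The recursive form of Eqn. (2.21) can be sketched in Python. This is a minimal illustration assuming scalar residuals, so that each filter's likelihood is a one-dimensional Gaussian in its residual with the filter-computed residual variance. The function names and the sample residuals are hypothetical.

```python
import math

def gaussian_likelihood(residual, variance):
    """Likelihood of a scalar residual that is zero mean with the given variance."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)

def mmae_update(priors, residuals, variances):
    """One step of Eqn. (2.21): reweight class probabilities by residual likelihoods."""
    weighted = [gaussian_likelihood(r, a) * p
                for r, a, p in zip(residuals, variances, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Two hypothetical filters with equal priors; the first has a small residual
# at each of three measurement times, the second a consistently large one.
probs = [0.5, 0.5]
for residual_pair in [(0.2, 1.5), (-0.1, 2.0), (0.3, 1.8)]:
    probs = mmae_update(probs, residual_pair, [1.0, 1.0])
```

After a few measurements the probability mass concentrates on the filter whose residuals are consistent with its own predicted statistics, which is the convergence behavior noted above.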

Conceptually, Fig. 2.2 expresses the way in which this multiple model parameter
estimation process is conducted. Note that this figure and the preceding discussion have
not considered the fact that it is possible and often desirable to obtain a single “overall”
estimate for the state $x_k$ and parameter set $\omega$, conditioned on the individual state
estimates provided by all of the individual estimators, and the likelihoods of their respective
parameter sets. This capability will not be required for our purposes, but is covered
in [154:129-136].

[Figure 2.2. Multiple Model Estimation Algorithm. Outputs: $p(\omega_1 \mid Z_k)$, $p(\omega_2 \mid Z_k)$, ..., $p(\omega_J \mid Z_k)$.
Where: $z_k$ = the k-th element in measurement sequence $Z_k$;
$\omega_j$ = the j-th class of J target classes;
$r_j$ = the residual vector for class j;
$A_j$ = the residual covariance for class j;
$p(\omega_j \mid Z_k)$ = the a posteriori probability of class j]

An  extension  to  the  basic  concept  of  parameter  estimation  through  residual  analysis 
and  multiple  model  filtering  is  that  of  “state  reasonableness  checking”  [59,  157].  If  the 
filter  designer  knows  that  a  particular  state  in  a  filter  designed  for  a  particular  parameter 
set should have a certain nominal value, he may heuristically treat the difference between
the estimated value and the nominal value like a residual, with variance given by the
appropriate element in the filter covariance matrix $\mathbf{P}(t_k)$. Modifying Eqn. (2.20), this
results in a Bayesian parameter estimation format of the form:

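$$p(\omega_i \mid \mathbf{Z}_k) = \frac{\left\{ \prod_{n=1}^{k} p(\mathbf{z}_n, \hat{\mathbf{x}}^r_n \mid \mathbf{Z}_{n-1}, \omega_i) \right\} p(\omega_i)}{\sum_{j=1}^{J} \left\{ \prod_{n=1}^{k} p(\mathbf{z}_n, \hat{\mathbf{x}}^r_n \mid \mathbf{Z}_{n-1}, \omega_j) \right\} p(\omega_j)} \tag{2.22}$$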

where: 

$\hat{\mathbf{x}}^r_n$ = state variable estimates at time $t_n$ for which we wish to check reasonableness
(generally a proper subset of the filter state vector estimate $\hat{\mathbf{x}}(t_n)$ - hence the superscript
$r$).

and  other  variables  are  as  defined  earlier.  A  convergent  form  of  this  parameter  estimator 
can  be  defined  analogously  to  Eqn.  (2.21). 

A  further  extension  to  this  basic  multiple  model  estimator  approach  provides  for  the 
parameters  of  the  estimators  themselves  to  change  or  adapt  based  on  observed  a  posteriori 
parameter  class  probabilities  [154:136]  [147].  This  is  the  most  general  form  of  the  multiple 
model  approach,  or  the  true  multiple  model  adaptive  estimator. 

2.3.1.4 Nonlinear Filters and Further Developments. We have seen that
the  extended  Kalman  filter  provides  a  means  to  apply  linear  filter  theory  to  systems  with 
nonlinear  state  dynamics  and/or  measurements  of  the  form  given  in  Eqns.  (2.14)  and  (2.15), 
by  successively  relinearizing  about  the  latest  best  estimate.  The  prices  paid  for  this  facility 
are  several:  the  resulting  estimate  is  neither  “optimal”  with  respect  to  every  reasonable 
definition  of  optimality  [153:205]  (e.g.,  the  extended  Kalman  filter  generally  provides  a 
biased  estimate  of  the  true  state  [154:52]),  nor  is  filter  stability  guaranteed  (as  is  linear  filter 
stability  under  nonrestrictive  assumptions  [154:24]),  nor  are  filter  gains  precomputable. 

Consider, on the other hand, an arbitrary nonlinear system of the form specified
by the following heuristic (since its form is incorrect for use in a proper stochastic
integral) nonlinear stochastic differential equation (where all variables have been previously
defined) [154:159-202]:

$$\dot{\mathbf{x}}(t) = \mathbf{f}[\mathbf{x}(t), \mathbf{u}(t), \mathbf{w}(t), t] \tag{2.23}$$


It can be shown that solutions to this state equation will not in general be Markov
(specifically we refer here to a Markov-1 system): i.e., knowing the value of $\mathbf{x}$ at some
time $t_{i-1}$ does not provide as much information about $\mathbf{x}$ at some later time $t_i$ as we might
obtain with knowledge of $\mathbf{x}$ at $t_{i-1}$ and at earlier times. This failure to be Markov means
that we cannot propagate a conditional density of the state estimate, such as we did in
effect for the linear Gaussian case using Eqns. (2.12) and (2.13). Thus, although we can
use Bayes’ Rule to define the change in our estimate from a measurement at any given
time, we cannot properly propagate that information forward to the next measurement
time.

However, for a system which can be described by an Ito stochastic differential
equation of the (rigorous) form:

$$d\mathbf{x}(t) = \mathbf{f}[\mathbf{x}(t), \mathbf{u}(t), t]\,dt + \mathbf{G}[\mathbf{x}(t), \mathbf{u}(t), t]\,d\boldsymbol{\beta}(t) \tag{2.24}$$

or, heuristically,

$$\dot{\mathbf{x}}(t) = \mathbf{f}[\mathbf{x}(t), \mathbf{u}(t), t] + \mathbf{G}[\mathbf{x}(t), \mathbf{u}(t), t]\,\mathbf{w}(t) \tag{2.25}$$

(where $d\boldsymbol{\beta}(t)$ is the stochastic differential of a vector Brownian motion process and all other
variables have been previously defined), then the solution $\mathbf{x}(t)$ is Markov (although not
generally Gaussian), and it is conceptually possible to propagate the conditional density,
conditioned on the previously known state value, using the forward Kolmogorov equation,
or Fokker-Planck equation [154:192-215]. Given measurements of the form in Eqn. (2.15),
Bayes’ rule, and the Chapman-Kolmogorov equation, we can theoretically define an optimal
nonlinear estimator of the conditional density of states, conditioned on the observed
measurement time history [154:212-215]. However, this optimal estimator generally will
be infinite dimensional, and therefore impossible to implement in practice.

However, if we choose to approximate conditional densities of $\mathbf{x}$ with a finite number
of moments or an assumed Gaussian form (noting that Gaussian densities require but two
moments for a complete density function characterization), we can define conditional
moment estimators [154:215-238], higher order filters that can be implemented in a practical
fashion (at least through fourth moments) to provide improved performance over the
extended Kalman filter. Under certain assumptions and/or conditions (ignoring higher order
derivatives, etc.) [154:223], since Eqns. (2.5) and (2.14) are Ito stochastic equations with
the general form of Eqn. (2.25), these higher order filters reduce to the extended Kalman
filter or even linear filters.

A particularly efficient modification to the extended Kalman filter, providing generally
intermediate performance between the basic extended Kalman filter and the nonlinear
filters discussed above, is the addition to the extended Kalman filter of so-called “bias
correction terms” [154:215-238]. These terms are in fact key terms from the higher order
filter expressions. In every case noted so far, however, improved performance is gained
only at the expense of additional computation. Readings by the author appear to show
in  general  that,  thus  far  in  practice,  systems  designers  find  the  basic  extended  Kalman 
filter  adequate  for  most  purposes,  and  do  not  incur  the  expense  of  going  to  higher  order 
nonlinear  filters. 

A number of other routes to nonlinear filters are available, as discussed in [154:239-259].
In particular, recent work by Bishop [32] demonstrated application of a geometric
nonlinear  filter  to  aircraft  tracking.  His  effort  is  of  special  interest  because  this  author’s 
extensive  literature  survey  (see  previous  references  in  this  section)  indicates  that  Bishop’s 
work  is  perhaps  the  only  fundamentally  new  approach  to  kinematic  measurement-only 
aircraft  tracking  for  fire  control  (gun  aiming)  in  the  last  ten  years.  This  effort  is  discussed 
in  more  detail  in  App.  C. 

2.3.2  Object  Tracking  with  Kinematic  Measurements  Only.  Prior  to  the  landmark 
effort  by  Kendrick  et  al.  [120,  121],  object  tracking  filters  or  target  state  estimators  were 
designed  to  use  only  “kinematic”  measurements  -  defined  here  as  in  Chapter  I  to  be 
measurements of the translation of the object centroid through physical (generally
three-dimensional) space. This information took the form of measurements of object range, range
rate, pointing angle, and angle rate, as provided to varied extents by radar, passive optical
or infrared tracking systems, or laser. An excellent historical discussion of the material
presented here is found in the 1984 work by Chang and Tabaczynski [50].

2.3.2.1 Common Filter Models for Tactical Targets. Given that an extended
Kalman filter has been selected for a particular tracking problem, the designer
is faced with three decisions or choices - reference frame in which to implement the
filter, object (behavior) model, and measurement noise assumptions. We will address the
more common choices for each in turn, again drawing heavily from a previous work by the
author [143] - a recent (1990) work giving more detail is [35].

The  two  principal  frames  considered  for  airborne  tracking  are  the  line-of-sight  (LOS) 
or  antenna  frame,  and  the  inertial  or  inertial  measurement  unit  (IMU)  frame.  The  LOS 
frame must be further classified as constantly rotating with respect to inertial space or
space-stabilized but impulsively realigned between measurements [229]. For radar tracking,
the  LOS  frame  is  easily  related  to  the  “measurement  coordinate”  frame  [35:183]  in  which  a 
tracking  problem  may  be  directly  posed  in  the  m-dimensional  measurement  vector  space. 
Variations  in  IMU  frame  mechanizations  depend  primarily  on  the  Cartesian  frame  with 
respect  to  which  the  designer  desires  to  define  accelerations.  Common  choices  of  inertial 
reference  frame  include  the  locally-fixed  North-East-Down  site  or  navigation  frame,  as  for 
ground  or  local  air  vehicle  tracking,  or  the  earth-centered  geocentric  frame  (fixed  with 
respect  to  the  stars),  as  for  satellite  tracking. 

Since object dynamics are more easily expressed in an inertial frame, the use of an
inertial frame often makes the object state equation (Eqn. (2.14)) more tractable, and,
with certain dynamics models, may even make it linear, as in Eqn. (2.5). With radar
measurements of range and (small) error angles in azimuth and elevation, and perhaps range
and/or angle rates, the LOS frame, on the other hand, can provide a linear measurement
equation (errors induced by LOS frame implementations, and their compensation, are
discussed in [35:184-188]). Linear relations are desirable in extended Kalman filter
formulation where possible because they reduce the computationally expensive need to compute
partial derivatives. In usual practice, the computationally more attractive filter design is
often one with linear state dynamics and nonlinear measurements. This arises because
the dynamics relations are generally computed much more often in an extended Kalman
filter than the measurement relations - to maintain high accuracy in numerical integration,
dynamics are propagated sequentially over many subintervals of the measurement sample
period. This is notably often the case for radar tracking.

The  choice  of  an  appropriate  model  for  object  dynamics  must  be  based  on  a  careful 
study  of  what  an  object  is  likely  to  do.  Perhaps  the  most  popular  tactical  target  dynamics 
model  is  the  Gauss-Markov  Acceleration  model,  as  proposed  by  Singer  [202].  This  model 
assumes  an  exponential  correlation  in  time  between  the  accelerations  in  any  direction  in 
inertial  space  -  the  state  equations  for  each  spatial  dimension  are  of  the  following  form: 

$$\frac{d}{dt}\begin{bmatrix} p^{i}_{x_{T/I}} \\ v^{i}_{x_{T/I}} \\ a^{i}_{x_{T/I}} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & -1/\tau \end{bmatrix} \begin{bmatrix} p^{i}_{x_{T/I}} \\ v^{i}_{x_{T/I}} \\ a^{i}_{x_{T/I}} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ w \end{bmatrix} \tag{2.26}$$

where:

$p, v, a$ = (respectively) position, velocity, and acceleration
$x$ = arbitrarily, one of three orthogonal directions in the chosen inertial frame
$T/I$ = indicator that quantities are for the target, relative to the inertial frame
$i$ = indicator that quantities are coordinated in inertial frame coordinates
$\tau$ = correlation time for target accelerations
$w$ = zero-mean white Gaussian noise of strength $Q$, as discussed in Sect. 2.3.1

Note that the acceleration in the above equation is the output of a first order lag
driven by white Gaussian noise. Simpler models may provide for (1) zero acceleration
(constant velocity) targets, or (2) acceleration as a continuous white Gaussian noise process
(interpretable as equivalent to the first model but with pseudonoise added to allow for
filter tuning), or (3) acceleration as the integral of white Gaussian noise (a Brownian
motion process) of given strength. These modelling possibilities and attendant cautions
are discussed from a general but tracking-relevant perspective in [153:180-185]. Some of
these and other models, including, for example, piecewise constant (in time) acceleration,
with each piecewise constant value found as the realization of a discrete white Gaussian
noise process, are discussed in [10:82-88].

[Figure: plot not reproduced; horizontal axis: Acceleration (A).]

Figure 2.3. The Singer Model for Acceleration Probability Density

Using the Singer approach as in Eqn. (2.26), the designer must specify the correlation
time $\tau$ and the strength $Q$ (a scalar in any one dimension) of the white Gaussian noise. As
discussed in [143], the correlation time is inferred directly from studies of target
maneuverability such as [95]. Given a choice for $\tau$, and a zero-mean Gaussian approximation for
the magnitude of the target acceleration (i.e., a standard deviation $\sigma$), the designer can
define $Q = (2/\tau)\sigma^2$ readily as shown in [153:178].
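
The Singer model of Eqn. (2.26) and the relation Q = (2/tau) sigma^2 can be illustrated numerically. The following sketch is an illustration under stated assumptions, not code from the cited references: the function names and numerical values are hypothetical, and the closed-form transition matrix is simply the matrix exponential of the dynamics matrix of Eqn. (2.26) over one sample period T.

```python
import numpy as np

def singer_continuous_model(tau, sigma):
    """Dynamics matrix F, noise input matrix G, and noise strength Q for one axis."""
    F = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, -1.0 / tau]])
    G = np.array([[0.0], [0.0], [1.0]])
    Q = 2.0 * sigma ** 2 / tau
    return F, G, Q

def singer_transition_matrix(tau, T):
    """Closed-form exp(F*T) for the Singer dynamics matrix."""
    e = np.exp(-T / tau)
    return np.array([[1.0, T, tau ** 2 * (T / tau - 1.0 + e)],
                     [0.0, 1.0, tau * (1.0 - e)],
                     [0.0, 0.0, e]])

def steady_state_acceleration_variance(tau, Q):
    """Steady-state variance of a first-order lag driven by white noise of strength Q."""
    return Q * tau / 2.0

# Hypothetical values: 2 s correlation time, 3 m/s^2 acceleration standard deviation.
F, G, Q = singer_continuous_model(tau=2.0, sigma=3.0)
Phi = singer_transition_matrix(tau=2.0, T=0.5)
variance = steady_state_acceleration_variance(tau=2.0, Q=Q)
```

The last function shows why the factor 2/tau appears: a first-order lag with correlation time tau, driven by white noise of strength Q, settles to a variance of Q tau / 2, so choosing Q = (2/tau) sigma^2 reproduces the desired acceleration variance sigma^2.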

Recognizing  that  no  real  target  has  a  Gaussian  acceleration  density  (since  there 
is  no  upper  bound  on  the  acceleration  magnitude  for  a  Gaussian  density),  Singer  [202] 
defined $\sigma$ from the second moment of a symmetrical, zero-mean density based on discrete
probabilities of some maximum positive or negative acceleration or zero acceleration, with a
uniform probability density for accelerations between the maximum and minimum values.
This density is shown in Fig. 2.3, where $A$ represents vehicle acceleration, $P_{max}$ is the
discrete probability of some maximum acceleration $A_{max}$, and $P_0$ is the discrete probability
of no acceleration. This probability density function and associated formulas are shown
in [202, 143, 35, 120].
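
The second moment of the density just described follows directly from its three parts: a discrete mass P_max at each of +A_max and -A_max, a discrete mass P_0 at zero, and the remaining probability spread uniformly between the extremes. A minimal sketch of that calculation is given below; the function names and the example numbers are hypothetical, and the closed form is derived here from the description above rather than copied from the cited references.

```python
def singer_variance_closed_form(a_max, p_max, p_0):
    """sigma^2 = (A_max^2 / 3) * (1 + 4*P_max - P_0)."""
    if p_max < 0.0 or p_0 < 0.0 or p_0 + 2.0 * p_max > 1.0:
        raise ValueError("discrete probabilities must be nonnegative and sum to at most one")
    return (a_max ** 2 / 3.0) * (1.0 + 4.0 * p_max - p_0)

def singer_variance_by_parts(a_max, p_max, p_0):
    """Second moment summed part by part (the mean is zero by symmetry)."""
    discrete_part = 2.0 * p_max * a_max ** 2          # masses at +A_max and -A_max
    uniform_mass = 1.0 - p_0 - 2.0 * p_max            # probability left for the uniform part
    uniform_part = uniform_mass * a_max ** 2 / 3.0    # uniform on [-A_max, A_max]
    return discrete_part + uniform_part               # the mass at zero contributes nothing

# Hypothetical target: A_max = 30 m/s^2, P_max = 0.1, P_0 = 0.4, tau = 10 s.
sigma_squared = singer_variance_closed_form(30.0, 0.1, 0.4)
noise_strength = 2.0 * sigma_squared / 10.0           # Q = (2/tau) * sigma^2
```

With these numbers the variance is 300 (m/s^2)^2 and the corresponding white noise strength is 60.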
Another popular dynamics model for tactical targets is the Constant Turn Rate
model, based on the assumption that the target performs planar, constant turn rate,
constant speed, or circular, maneuvers. This model was proposed for ground targets by
Burke in 1978 [140:265] and independently for air targets by others around the same time,
and investigated for air targets by Maybeck and Worsley [156, 229]. An apparently related
target model concept proposed by Bishop [32] is his “coordinated turn” model, which
provides for an aircraft target to make planar turns with constant lift and “longitudinal”
(thrust less drag) force magnitudes.
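
To make the constant turn rate assumption concrete, the sketch below propagates a planar, constant speed, constant turn rate trajectory. It is an illustration only: the text above gives no equations for this model, so the state ordering [x, y, vx, vy], the function name, and the numerical values are assumptions, and the transition used is the standard closed-form solution for circular motion rather than the formulation of any cited reference.

```python
import numpy as np

def constant_turn_rate_step(state, turn_rate, T):
    """Advance [x, y, vx, vy] by T seconds at a constant turn rate (rad/s)."""
    x, y, vx, vy = state
    if abs(turn_rate) < 1e-12:
        # Degenerate case: no turn, so the motion is constant velocity.
        return np.array([x + vx * T, y + vy * T, vx, vy])
    s, c = np.sin(turn_rate * T), np.cos(turn_rate * T)
    return np.array([
        x + (s / turn_rate) * vx - ((1.0 - c) / turn_rate) * vy,
        y + ((1.0 - c) / turn_rate) * vx + (s / turn_rate) * vy,
        c * vx - s * vy,   # the velocity vector rotates by turn_rate * T
        s * vx + c * vy,
    ])

# Hypothetical target: 10 m/s along +x, turning at 0.1 rad/s (turn radius 100 m).
turn_rate = 0.1
T = 2.0 * np.pi / turn_rate / 8.0      # one eighth of a full circle per step
track = [np.array([0.0, 0.0, 10.0, 0.0])]
for _ in range(8):
    track.append(constant_turn_rate_step(track[-1], turn_rate, T))
```

After eight steps the target has flown one full circle and returns to its initial state, with speed unchanged throughout.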

Finally,  we  consider  measurement  noise  models.  For  most  radar  tracking  applications, 
this  issue  reduces  in  practice  [35:148]  to  a  selection  of  measurement  noise  covariance  R  for 
discrete  time  white  Gaussian  noise  added  to  position  and  velocity  measurements  of  the 
form  in  Eqn.  (2.15).  It  should  be  noted,  however,  that  radar  measurement  noise  is  in  fact 
often  highly  time-correlated,  since  the  noise  is  a  function  of  the  spatial  relationships  of 
the tracking radar and the target scatterers [143] [35:161-162] - relationships which are
time-correlated by kinematics. One option in this case is to augment the state model with
additional  noise  states  as  discussed  in  Maybeck  [153:180-185]  -  the  fundamental  problem 
(for  a  conventional  radar  tracker)  here  is  that,  to  the  extent  that  the  noise  state  “dynamics” 
model  resembles  the  target  dynamics  model,  false  aimpoint  motion  from  noise  and  true 
target  motion  are  indistinguishable  (a  classic  state  estimation  observability  problem). 

2.3.2.2 The α-β and α-β-γ Filters. The α-β and α-β-γ tracking filters
appeared respectively in articles by Benedict and Bordner [24] in 1962 and Simpson
in 1963 [201]. Neal [171] showed in 1967 that these filters could be interpreted as
steady-state Kalman filters. Subsequent developments can be traced in [84, 85, 50, 10].

The α-β and α-β-γ filters are distinguished from Kalman filters in that measurements
and states in each of two or three dimensions (as required for earth surface or
submarine/aerospace tracking, respectively - henceforth we will consider only the three
dimensional case) are considered independently, and measurements are provided for position
only, corrupted by discrete zero-mean white Gaussian noise. Since the dimensions are
considered independently, the nonlinear measurement equation option as used in the
extended Kalman filter (see Eqn. (2.15)) has no counterpart in the α-β and α-β-γ filters.
Thus, the α-β and α-β-γ filters prescribe independent measurement equations for each
spatial coordinate (orthogonal x-y-z or range-azimuth-elevation) of the form Eqn. (2.6),
where $\mathbf{H} = [1 \ 0 \ 0]$, and the update equation is written as in Eqn. (2.9), with:

$$\mathbf{K}(t_i) = [\alpha \quad \beta/T \quad \gamma/T^2]^{\mathsf{T}} \tag{2.27}$$

where $T = (t_{i+1} - t_i)$ is the measurement update time interval, and $\mathbf{K}(t_i)$ is written as
a transposed column vector. For particular stochastic driving noise assumptions, explicit
equations have been derived for analytical or numerical solutions to α, β, and (where
applicable) γ (see refs. in previous paragraph).
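
A single cycle of such a filter for one spatial coordinate can be sketched as follows. This is an illustrative sketch under stated assumptions, not the formulation of any cited reference: the gain convention [alpha, beta/T, gamma/T^2] is one of several in use (others scale gamma differently), the constant-acceleration prediction step is assumed, and the function name and numbers are hypothetical.

```python
import numpy as np

def alpha_beta_gamma_step(x, z, T, alpha, beta, gamma):
    """Predict [position, velocity, acceleration] over T, then update with position z."""
    Phi = np.array([[1.0, T, 0.5 * T ** 2],
                    [0.0, 1.0, T],
                    [0.0, 0.0, 1.0]])                 # constant-acceleration prediction
    K = np.array([alpha, beta / T, gamma / T ** 2])   # fixed gain (assumed convention)
    x_pred = Phi @ np.asarray(x, dtype=float)
    residual = z - x_pred[0]                          # H = [1 0 0] picks out position
    return x_pred + K * residual

# Hypothetical single update: start at 0 m moving at 1 m/s, T = 2 s, measure 3 m.
x_new = alpha_beta_gamma_step(x=[0.0, 1.0, 0.0], z=3.0, T=2.0,
                              alpha=0.5, beta=0.4, gamma=0.1)
```

Because the gain is fixed, no covariance propagation is required, which is the source of the computational savings these filters offer relative to a Kalman filter. Setting gamma to zero and dropping the acceleration state gives the α-β form.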

Considering the hierarchy of choices available for target tracking (the extended
Kalman filter, the α-β and α-β-γ filters, and finite memory/fading memory filters [154]),
Chang and Tabaczynski [50] found the extended Kalman filter in general to be the best
choice if computational requirements are not too severe, followed in desirability by the
α-β and α-β-γ filters for use in cases where computation is severely limited but degraded
performance is acceptable. The author does not intend to make use of these filter forms:
α-β and α-β-γ filters are mentioned here only because they were encountered in related
works by other authors, to be discussed later in this chapter.

2.3.3 Tracking with Kinematic and Signature Measurements. Developments in
this relatively new field have taken two fundamentally different directions - these may
be characterized as solving problems in (1) fire control, predicting future position of a
moving target to provide a gun aiming solution, or (2) observation-to-track assignment,
as performed in a target acquisition and surveillance system to assign observed targets to
existing tracks. Both approaches will be discussed.

The factor common to all techniques discussed in this section is that they fuse
kinematic and feature observable information for tracking, or kinematic (principally
translation) state estimation. Other tracking techniques using dynamic programming will be
discussed in later sections of this chapter. In the next chapter, these existing approaches
can be contrasted to the author’s research, which combines kinematic and feature
observable information using approaches from this section and dynamic programming techniques
for object or target recognition.

2.3.3.1 Fire Control - Approach by Kendrick, Maybeck, and Reid, and Further
Developments. All developments in melding kinematic and feature observable
measurements for fire control appear to stem from an observation by Reid [120:iii] in the
mid-1970’s that target/aspect angle classifiers could help predict the motion of an aircraft
target by identifying the orientation of the plane of the wings relative to the velocity
vector. Kendrick, Maybeck, and Reid [120, 121] then developed a tracker that employed two
coupled Kalman filters: one filter providing a target orientation estimate based on imaging
sensor measurements, the other filter providing kinematic state estimates from kinematic
measurements and (from the orientation filter) direction (but not magnitude) of “normal
load acceleration” (acceleration from lift, normal to the velocity vector).

Subsequent developments of the fire control problem were made almost exclusively
by Andrisani et al. [5, 4] and Sworder et al. [140, 141, 208, 209]. Contributions by
Andrisani et al. were twofold: (1) an improved dynamics model for conventional aircraft that
employed the coordinated turn assumption and included both kinematic states and aspect
angles in the extended Kalman filter state equations, producing one filter rather than two
coupled ones, and (2) extension of the dynamics and measurement models to the case of
a helicopter target. In particular, the Andrisani efforts use image-derived orientation to
estimate not only direction of normal load acceleration but also its magnitude. Sworder et
al. investigated designs for improved tank gunnery against moving ground targets
(principally tanks), where the observation by Reid noted above does not apply. Notably, the
original estimator form by Kendrick et al. continues to be employed in current research,
as seen in recent work by Dayton et al. [63].

With reference to the discussion in Sect. 2.3.1.2 on acceleration inputs by the aircraft
pilot, the motivation behind both the Kendrick and Andrisani-type estimators was really
input estimation for aerodynamic forces - those forces being most significant with respect
to changing the curvature of the trajectory. Differences between the estimated acceleration
and the true acceleration were represented by linear white Gaussian noise-driven processes
analogous to those discussed in Sect. 2.3.2. Further development of this concept, apparently
not yet pursued, might explore sensor-tracker issues arising from sudden changes in thrust
(e.g., changes from cruise to afterburner thrust or back, as revealed by sudden changes in
target infrared intensity).

Eagle  et  al.  [75]  have  proposed  and  investigated  a  version  of  the  Andrisani  estimator 
using  regression  dynamics  techniques  to  estimate  target  parameters  for  improved  tracking 
and  possibly  target  recognition.  In  Chapter  IV  the  author  will  propose  a  third  form  of 
kinematic/aspect-angle  filter  for  use  in  target  recognition,  using  classical  multiple  model 
residual analysis-based parameter estimation techniques [154:129-136] to provide
probabilities of target class membership.

Dynamics  and  measurement  equations  for  the  Kendrick  and  Andrisani  estimators 
are  presented  in  App.  C.  The  form  of  the  Kendrick  estimator  is  illustrated  in  Fig.  2.4. 

The  unconventional  structure  of  the  Kendrick  estimator  makes  it  necessary  to  provide 
a  short  explanation  of  the  sequencing  of  its  operations.  Basically,  at  measurement  times, 
the  aspect  angle  filter  is  provided  with  two  measurements  -  a  “pose  estimate”  provided  by 
an  image  processor,  and  a  “pseudo-measurement”  calculated  from  the  propagated  target 
kinematics  (discussed  in  the  following  paragraph).  The  aspect  angle  filter  then  outputs 
an  estimate  of  the  target  aspect  angle.  Subsequent  processing  in  effect  rotates  this  aspect 
estimate  around  the  target  body  pitch  axis  (nominally,  the  axis  of  the  wings  -  see  Fig.  5.23) 
back  toward  the  velocity  vector  by  an  angle  of  attack  calculated  from  the  lift  magnitude 
(derived  in  turn  from  the  kinematic  estimate,  as  in  the  following  paragraph).  The  resulting 
orientation  of  the  plane  of  the  wings  is  taken  to  indicate  the  direction  of  the  normal 
aerodynamic  forces  (lift,  under  these  assumptions).  This  direction  information  is  provided 
to the kinematic filter, which also receives classical range, range rate, angle (i.e.,
sensor-target pointing angle), and angle rate measurements, as provided by a radar and/or some
other sensor suite. The kinematic filter provides an updated kinematic estimate, which is
propagated conventionally to the next update time, and the cycle begins again.

Figure 2.4. The Kendrick-Type Kinematic/Aspect-Angle State Estimator


The  pseudo-measurement  of  target  aspect  angle  is  based  on  the  assumption  that 
target  acceleration  normal  to  the  velocity  vector  is  due  only  to  lift  and  gravity,  and  that 
there is no sideslip (component of velocity normal to the body frame $x_b$-$z_b$ plane:
see Fig. 5.23). Target acceleration normal to the velocity vector is available from the
filter, the direction and magnitude of gravity are well known, the normal acceleration due
to aerodynamic forces is assumed due to lift only, and target mass is assumed known.
The standard equation (see Sect. 5.5.3) for lift magnitude as a function of air density,
velocity, coefficient of lift, wing surface area (the latter two quantities assumed known for
a given target class), and angle of attack then yields the remaining unestimated value -
the  target’s  angle  of  attack.  Thus  the  velocity  vector,  the  direction  of  normal  aerodynamic 
forces,  and  the  computed  angle  of  attack  now  completely  specify  the  estimated  target 
orientation  relative  to  the  inertial  frame,  or,  with  simple  coordinate  transformation,  the 
pseudo-measurement  of  target  aspect  as  seen  from  the  attacking  aircraft. 
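
The chain of reasoning above can be sketched numerically. The code below is a hedged illustration, not the Kendrick formulation (which is given in App. C): it assumes a North-East-Down inertial frame, a linear lift curve with zero lift at zero angle of attack (lift coefficient equal to a known slope times angle of attack), and still air. The function name, parameter names, and numerical values are all hypothetical.

```python
import numpy as np

GRAVITY_NED = np.array([0.0, 0.0, 9.81])  # m/s^2; "down" is +z in a North-East-Down frame

def angle_of_attack_pseudo_measurement(velocity, acceleration, mass, air_density,
                                       wing_area, lift_curve_slope):
    """Return (angle of attack in radians, unit vector along the lift direction)."""
    v = np.asarray(velocity, dtype=float)
    a = np.asarray(acceleration, dtype=float)
    speed = np.linalg.norm(v)
    v_hat = v / speed
    # Remove gravity to leave the acceleration produced by non-gravitational forces.
    a_aero = a - GRAVITY_NED
    # Keep only the part normal to the velocity vector, attributed entirely to lift.
    a_normal = a_aero - (a_aero @ v_hat) * v_hat
    normal_magnitude = np.linalg.norm(a_normal)
    lift = mass * normal_magnitude
    dynamic_pressure = 0.5 * air_density * speed ** 2
    # Invert lift = q * S * (lift_curve_slope * alpha) for alpha (assumed linear lift curve).
    alpha = lift / (dynamic_pressure * wing_area * lift_curve_slope)
    lift_direction = a_normal / normal_magnitude if normal_magnitude > 0.0 else np.zeros(3)
    return alpha, lift_direction

# Hypothetical target in straight, level, unaccelerated flight at 200 m/s.
alpha, lift_direction = angle_of_attack_pseudo_measurement(
    velocity=[200.0, 0.0, 0.0], acceleration=[0.0, 0.0, 0.0],
    mass=10000.0, air_density=1.0, wing_area=30.0, lift_curve_slope=5.0)
```

In this level-flight case lift exactly balances weight, so the lift direction points straight up (negative z in the North-East-Down frame) and the computed angle of attack is a few hundredths of a radian. Combined with the velocity direction, these two outputs are the ingredients of the orientation pseudo-measurement described above.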
Completing the discussion of target models begun in Sect. 2.3.2, we note that the
Kendrick estimator uses a dynamics model for normal load (lift) acceleration that, unlike
the Singer approach discussed in the earlier section, is biased to provide a higher probability
of “upward” acceleration - that is, acceleration due to positive lift, or positive angle of
attack. This reflects the realistic desire of pilots to accelerate in a direction they can
observe, and to take the resulting acceleration load down into the seat, which is both more
acceptable to human physiology and more easily compensated, as by a “G-suit” and/or
isometric exercise-like exertions.

This filter system would be expected to provide robust performance for trajectory
changes due to rolls, for constant angles of attack. However, because angle of attack is
always computed as a function of velocity and acceleration magnitude from the kinematic
filter, this system should have problems responding to sudden normal load acceleration
magnitude changes resulting from pitch maneuver/angle of attack changes by the pilot.
Apparently, consideration of this issue prompted the subsequent development of the
Andrisani filter [5].

The form of the Andrisani estimator is conventional, since this estimator consists
of one rather than two coupled Kalman filters. The form of the Andrisani estimator is
illustrated in Fig. 2.5.

A key factor in both the Kendrick and Andrisani approaches is the assumption that
the atmosphere is considered at rest with respect to the inertial frame. This is a significant
assumption, since many of the other assumptions (e.g., zero sideslip angle) are in fact
wind-relative. To the extent that target-local wind velocity relative to the inertial frame
is known, compensation is straightforward [79] - the remaining uncertainty contributes to
a form of directionally-dependent bias error falling within the class of errors which should
not pose a great problem for the proposed methods.

In contrast to the Kendrick and Andrisani methods, Sworder [140, 141, 208, 209] by
necessity took an entirely different approach, although his motivation was much the same.
Like Kendrick and Andrisani, Sworder wished to use the imaging sensor basically as an
input estimator - estimating the lateral (right-left steering) acceleration input $\mathbf{u}(t)$ for the
state equation (Eqn. (2.14)) in an extended Kalman filter which processes also the usual
kinematic measurements. However, for the ground targets of Sworder’s concern, simple
orientation is no longer sufficient to indicate acceleration unambiguously - to estimate
lateral acceleration here we must measure the change of orientation over time. Sworder et al.
attack this problem using the theory of marked point processes, as defined by Snyder [205].
Equations relevant to their approach are given in App. C.

Figure 2.5. The Andrisani-Type Kinematic/Aspect-Angle State Estimator

It  should  be  noted  that  use  of  the  Sworder  approach  for  gun  lead  angle  estimation 
against  tank  targets  may  have  a  practical  shortcoming  which  is  not  mentioned  in  the 
published  references  [140,  141,  208,  209].  In  the  case  of  typical  tank-to-tank  engagements 
at  ranges  of  one  to  three  kilometers,  where  the  Sworder  estimator  ultimately  is  trying 
to  calculate  a  lead  angle  for  a  tank  main  gun  trajectory  solution,  it  is  likely  that  the 
target  tank  also  will  be  attempting  to  acquire  and  engage  targets  of  his  own.  This  means 
that  the  target  tank’s  turret  will  be  moving,  usually  under  two-axis  stabilization,  quite 
independently in orientation from its hull (i.e., the propulsion system). Since modern tank
turrets are so large relative to the hull, and since the hull is more likely to be hidden by
terrain, it seems quite likely that turret orientation will drive the image-based orientation
estimator. Therefore, the use of these orientation estimates for target (hull) trajectory
estimation  could  be  misleading  -  in  particular  if  the  crew  of  the  target  tank  know  that 
they  can  confuse  enemy  fire  control  by  uncoordinated  motion  of  their  turret  and  hull.  The 
key  point  here  is  that  when  the  relationship  between  kinematics  and  aspect  angle  breaks 
down,  measuring  the  latter  may  not  help  to  estimate,  let  alone  predict,  the  former. 

Concluding this discussion of target trackers fusing kinematic and feature observable
measurements, a series of diagrams is presented to show how these systems and their
functioning can be represented using the concept of the hypothetical aspect angle sphere,
as shown in Fig. 1.2. Figure 2.6 illustrates the target information used and produced by
Kendrick et al. [120, 121] and subsequently by Andrisani et al. [5]. Each circle represents
the boundary of a particular 3-D target model sphere, the hypothetical aspect angle sphere
centered on the defined centroid of a particular target model. On the spheres are inscribed
paths which correspond to aspect angle histories over time. There are fundamentally four
different kinds of aspect angle history - true, feature observable-based, kinematic
estimate-based, and multi-sensor estimate-based. Marks recorded along the path correspond
to particular times at which we synchronize measurements and estimates.

Consider  the  topmost  target  model  sphere  -  Model  (1)  -  in  Fig.  2.6.  The  path 
marked  with  X’s  (crosses)  is  the  true  aspect  angle  path  -  specifically,  the  path  inscribed 
on  the  surface  of  the  target  model  sphere  at  the  point  of  intersection  of  the  sphere  surface 
and  the  vector  from  the  center  of  the  target/sphere  to  the  sensor  (see  Fig.  1.2).  The 
set  of  (true)  feature  observable  values  corresponding  to  any  point  along  this  path  can  be 
described  uniquely  as  a  function  of  azimuth  and  elevation  angles  relative  to  a  target-model 
body  coordinate  frame.  The  angular  “roll”  dimension  about  the  vector  is  unimportant 
here  because  we  consider  only  (in-plane)  rotation-invariant  feature  observables  (should  this 
target/sensor-relative “roll” become critical for a particular sensor, it could be considered
in  defining  true  or  expected  signatures).  In  any  given  engagement,  this  true  path  defines 
the  set  of  feature  observables  which  would  be  observed  with  perfect  measurements.  Note 
Figure 2.6. Aspect Angle Information Fusion in the Kendrick-Type Estimators. (Path labels: true aspect angle path; feature observable-based aspect path (pose estimate history); kinematically-estimated aspect path; “fused” aspect angle path.)




that,  in  any  single  engagement  (one  target),  there  is  only  one  true  aspect  angle  path  - 
corresponding  to  the  true  model  out  of  (say)  J  possible  models. 

The  path  marked  with  o’s  (circles)  is  the  feature  observable-based  path,  or  pose 
estimate  history.  This  is  the  aspect  angle  path  that  would  be  calculated  (estimated)  for 
any  given  target  model  using  the  set  of  corrupted  measurements  extracted  by  viewing 
the  true  model  over  the  true  path  with  an  actual  sensor.  As  discussed  in  Sect.  2.2.1, 
for  any  given  measured  feature  observable  value,  the  corresponding  aspect  angle  or  pose 
estimate  is  extracted  from  each  model  or  associated  signature  library,  generally  using 
nearest neighbor or lookup techniques. Probabilistically, this aspect angle (state) estimate
in general represents $\arg_{x^a}\{\max[\,p(z^f \mid x^a)\,]\}$, that is, a maximum (classical) likelihood
estimate of the aspect angle $x^a$ for some feature observable measurement $z^f$, where the
superscript $f$ denotes “feature”. Aspect angle estimates from kinematics would be used
at  most  only  to  start  the  search  in  some  acceptable  local  neighborhood  -  to  prevent  the 
algorithm  from  defining  an  obviously  incorrect  aspect  angle,  often  due  for  example  to 
planes  of  symmetry  which  give  nearly  identical  feature  values  for  aspect  angles  differing 
by  180  degrees. 
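
The lookup just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of any cited author: the signature library is a hypothetical list of (azimuth, elevation, feature vector) entries, the feature noise is assumed additive, white, and Gaussian (so that the maximum likelihood pose is simply the library entry nearest the measurement in Euclidean distance), and the names `ml_pose_estimate`, `library`, `search_center`, and `search_halfwidth` are invented for the example.

```python
import math

def angular_separation_deg(az1, az2):
    """Smallest absolute difference between two azimuths, in degrees."""
    diff = abs(az1 - az2) % 360.0
    return min(diff, 360.0 - diff)

def ml_pose_estimate(library, z_f, search_center=None, search_halfwidth=None):
    """Return the (azimuth, elevation) whose stored signature is nearest z_f.

    library: list of (azimuth_deg, elevation_deg, feature_vector) tuples
             (a hypothetical layout chosen for this sketch).
    search_center / search_halfwidth: optional azimuth window, e.g. taken
             from a kinematic estimate, used only to exclude mirror poses.
    """
    best_pose, best_dist = None, math.inf
    for az, el, features in library:
        outside = (search_center is not None and
                   angular_separation_deg(az, search_center) > search_halfwidth)
        if outside:
            continue  # not in the kinematically plausible neighborhood
        dist = math.dist(features, z_f)  # Euclidean distance in feature space
        if dist < best_dist:
            best_pose, best_dist = (az, el), dist
    return best_pose, best_dist

# Toy library with a plane of symmetry: 0 and 180 degrees look almost alike.
library = [
    (0.0, 0.0, [1.00, 0.20]),
    (90.0, 0.0, [0.30, 0.90]),
    (180.0, 0.0, [1.01, 0.21]),
    (270.0, 0.0, [0.35, 0.85]),
]
z_f = [1.008, 0.208]  # noisy measurement of the 0-degree signature
unconstrained, _ = ml_pose_estimate(library, z_f)
windowed, _ = ml_pose_estimate(library, z_f, search_center=10.0,
                               search_halfwidth=45.0)
```

In this toy case the unconstrained search returns the mirror-image pose at 180 degrees, while the windowed search returns the pose at 0 degrees, which is the role the text assigns to the kinematic estimate.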

Note  that,  if  the  (true)  target  model  from  which  the  measurement  is  extracted  is 
identical  to  the  (test)  target  model  from  which  the  corresponding  aspect  angle  is  sought, 
then there are only three reasons for an incorrect aspect angle determination: (1) production
of ambiguous (ill-defined, as for the visual image of a sphere, or noisy, as for
radar scatterer interactions) signatures at the target, (2) transmission noise, and (3) sensor
noise. Thus, for low-ambiguity signatures and low transmission/sensor noise, the true
aspect angle path and the feature observable-based aspect angle path on the correct model
should not differ greatly. However, if the true model and the test model differ, that model
mismatch provides an additional source for error in the feature observable-based aspect
angle path, as represented in the center diagram, Model (2), of Fig. 2.6. Finally, one
can envision cases (e.g., Model (3) in Fig. 2.6) in which severe mismatch between true
and test models results in a feature observable-based path that resembles a random walk
over the surface of the target sphere, perhaps not a reasonable aspect angle path in any
sense. Clearly, severe measurement-to-test model mismatches may provide grounds for
terminating determination of a feature observable-based aspect angle path early in the
process.

The path marked with squares is the kinematics-based aspect angle path. This is
the aspect angle path that would be expected, given kinematic estimates as provided
by a sensor/estimator suite measuring kinematic variables only (principally position and
velocity), with additional information on dynamic limitations of each target model. In the
aircraft tracking work discussed here, these dynamic relations are the basic coordinated
turn, lift (lift normal to velocity vector, nearly normal to wings), and angle of attack
(alpha nearly proportional to magnitude of lift, inversely proportional to speed squared)
assumptions found in Kendrick et al. [120:11] and elaborated by Andrisani et al. [5].

The path marked with ellipses is the multi-sensor estimate-based (“fused”) aspect
angle path - an aspect angle path based on some optimal estimator using both kinematic
and feature observable measurements. If we use the correct model as in the top diagram
in Fig. 2.6, it would be reasonable to assume that the fused path will fall between the
kinematic path and the feature observable path, or, for small (or at least unbiased) sensor
errors as shown, closer to the true aspect angle path than we could achieve with kinematic
estimates alone. But if we choose the wrong target model, the fused aspect path could
easily be a poorer estimate than provided by the kinematic path alone. Thus, the Kendrick
estimator and its derivatives should be very sensitive to proper choice of target model - the
research demonstrated in Chapter IV exploits this sensitivity.

The preceding discussion spans the extent of solutions for the fire control problem
by fusing kinematic and feature observable information. Developments discussed below by
authors other than Kendrick et al., Andrisani et al., and Sworder et al. have been directed
instead toward observation-to-track fusion.

2.3.3.2 Observation-to-Track Assignment: Approach by Bar-Shalom. Bar-Shalom’s
probabilistic technique for observation-to-track fusion, called Joint Probabilistic
Data Association (JPDA) in its latest form, is a method for updating a set of kinematic
track files where several observations can be associated with any given track [33:299-304]
[10, 8]. Fundamentally, the method assigns to each observation within some prediction
“gate” (i.e., a region in space around the nominal predicted position for the next
observation) a probability that that observation is the correct one. This probability is a
function of the distance of the observation from the anticipated new observation position,
as based on propagation of the track state vector from the last update to the current time.
Finally, the existing track is updated by a “pseudo-residual” constructed as a linear combination
of all the residuals (differences between each observed possible position and the
predicted position) in the prediction gate, each weighted by its probability of association
to the existing track.
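
The gating and weighting steps above can be illustrated with a deliberately simplified sketch. It is not Bar-Shalom's full algorithm: it assumes a single track, a diagonal innovation covariance, and association probabilities obtained by normalizing Gaussian likelihoods, and it ignores the clutter density and the possibility that no gated observation is correct, both of which a full probabilistic data association filter accounts for. The function name `pseudo_residual` and the gate value are choices made for the example.

```python
import math

def pseudo_residual(predicted, observations, variances, gate=9.21):
    """Combine gated residuals into one probability-weighted pseudo-residual.

    predicted:    predicted measurement (list of floats)
    observations: candidate measurements received at this scan
    variances:    diagonal of the innovation covariance (assumed diagonal)
    gate:         threshold on the squared normalized distance; 9.21 is
                  roughly the 99% chi-square point for two dimensions
    """
    residuals, likelihoods = [], []
    for obs in observations:
        r = [o - p for o, p in zip(obs, predicted)]
        d2 = sum(ri * ri / v for ri, v in zip(r, variances))
        if d2 <= gate:  # keep only observations inside the prediction gate
            residuals.append(r)
            likelihoods.append(math.exp(-0.5 * d2))
    if not residuals:
        return None, []
    total = sum(likelihoods)
    weights = [lik / total for lik in likelihoods]  # association probabilities
    combined = [sum(w * r[k] for w, r in zip(weights, residuals))
                for k in range(len(predicted))]
    return combined, weights

predicted = [10.0, 5.0]
observations = [[10.5, 5.0], [9.0, 5.5], [20.0, 5.0]]  # last lies outside gate
combined, weights = pseudo_residual(predicted, observations,
                                    variances=[1.0, 1.0])
```

Only the first two observations pass the gate; the nearer one receives the larger weight, and the track would then be updated with `combined` in place of a single residual.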

Bar-Shalom’s writings discuss this fact only briefly [10:229], but the JPDA structure
is inherently well suited to incorporating information from feature observable measurements.
Instead of, or in addition to, weighting the probability of correct association based
on spatial distance from each observation to the predicted nominal new location, one could
weight each observation probabilistically based on its distance from the previous track in
feature (observable) space.

2.3.3.3 Approach by Blackman. Blackman’s approach to fusion of kinematics
and “attributes” [8:205-209] [33:376-380] is apparently derived from the approach
taken by Bar-Shalom, and represents one manner in which Bar-Shalom’s recommendations
could be implemented. Blackman’s approach is summarized by the equation below (shown
here exactly as it appears in [8], except that Blackman uses a capital Z, which in the notation
to be introduced in the following section will mean a measurement history of several
measurements over time, rather than one measurement, as Blackman intended):


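\[
P_{ij} = \left( \frac{V_{ij}\, P_D \exp(-d_{ij}^2/2)}{(2\pi)^{M_{ij}/2} \sqrt{|S_{ij}|}} \right) \left( P(z_j \mid z_i) \right) \tag{2.28}
\]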


in which $P_{ij}$ is the probability that observation $j$ belongs to track $i$, $P_D$ is the probability
of detection, $d_{ij}$ is the quadratically-weighted distance (i.e., $d_{ij}^2 = r^T S_{ij}^{-1} r$, where $r$ is a
measurement residual vector) from observation $j$ to the predicted nominal location for the
next observation of track $i$ in $M_{ij}$-common dimensional space ($M_{ij}$ is the dimension of the
intersection of the measurement and track spaces), and $S_{ij}$ is the covariance of the estimate
for the predicted nominal. $V_{ij}$ is a “measurement volume” difference parameter attempting
to quantify the magnitude of the mismatch in spatial dimension and size between (1) the
space observed by the sensor taking measurement $j$ and (2) the space traversed by track $i$
to be updated (for example, quoting Blackman [8:205-206], “given an angle-only track and
a radar measurement that includes angle, range, and range rate, $V_{ij}$ is the extent of the
range and range-rate measurement space”). Finally, $P(z_j \mid z_i)$ is an “attribute likelihood
function,” giving the probability that the feature observables for the new observation could
have the values $z_j$ measured for observation $j$, given the predicted feature observable values
$z_i$ for track $i$.

Although it appears that Blackman intended to use $P_{ij}$ as a weighting factor in
the Bar-Shalom fashion, it should be noted that this quantity could be used to define a
“nearest neighbor” or “maximum likelihood” association technique as well, assigning the
single observation $j$ to that track $i$ for which $P_{ij}$ is greatest. In any case, Blackman’s
approach  appears  highly  heuristic:  it  is  related  to,  but  does  not  appear  to  be  explicitly 
derivable  from,  maximum  likelihood  methods. 
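
As a rough numerical illustration of the “maximum likelihood” use of this quantity, the sketch below scores one observation against two tracks and assigns it to the track with the larger score. It assumes that the kinematic factor in Eqn. (2.28) is a Gaussian density in the quadratically-weighted distance, scaled by the measurement volume and the probability of detection, and multiplied by the attribute likelihood; that reading, the function names, and every number are assumptions made for the example rather than values from [8].

```python
import math

def association_score(d2, m, det_s, volume, p_detect, attribute_likelihood):
    """Kinematic Gaussian term times an attribute likelihood (assumed form).

    d2:     squared quadratically-weighted distance d_ij^2
    m:      dimension of the common measurement/track space
    det_s:  determinant of the residual covariance
    volume: measurement volume parameter
    """
    gaussian = math.exp(-0.5 * d2) / (
        (2.0 * math.pi) ** (m / 2.0) * math.sqrt(det_s))
    return volume * p_detect * gaussian * attribute_likelihood

def best_track(scores_by_track):
    """Assign the observation to the track with the largest score."""
    return max(scores_by_track, key=scores_by_track.get)

# One observation j scored against two tracks (all numbers are made up).
scores = {
    "track_1": association_score(d2=1.0, m=2, det_s=4.0, volume=1.0,
                                 p_detect=0.9, attribute_likelihood=0.2),
    "track_2": association_score(d2=3.0, m=2, det_s=4.0, volume=1.0,
                                 p_detect=0.9, attribute_likelihood=0.8),
}
chosen = best_track(scores)
```

Here the observation is kinematically closer to the first track, but the attribute likelihood favors the second strongly enough to reverse the assignment.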

2.3.3.4 Approach by Mitzel. Whereas Bar-Shalom and Blackman took
a probabilistic or statistical approach to observation-to-track fusion, fusing information
from multiple observations into one track, Mitzel’s approach [8:297-320] is a classic nearest
neighbor approach. Mitzel defined an augmented vector consisting of four kinematic
states - position and velocity in each of 2 dimensions - followed by a set of $m$ global
descriptor feature observable quantities. The kinematic states are updated in an $\alpha$-$\beta$
tracker, but the feature observable quantities are treated as constant with no driving noise,
and updated using the simple “static estimator” form of the linear Kalman filter [153:9-15],
under the assumption that the feature observable measurements are corrupted by
discrete-time white Gaussian noise.

For  each  observation  in  a  track  gate  defined  by  the  kinematic  estimate  covariance, 
Mitzel  defines  an  augmented  vector  of  the  same  form  as  that  maintained  for  the  track  itself. 
Propagating  the  track  forward  to  the  measurement  time,  Mitzel  then  simply  determines 
the  “nearest  neighbor”  observation  to  the  propagated  track  vector.  His  distance  metric 
is Euclidean in the $4 + m$ dimensional augmented vector space, weighted only by the
covariance of the vector elements (due to the dimensional independence assumptions in the
$\alpha$-$\beta$ tracker structure and Mitzel’s assumed independence of feature observables and
kinematics, the only non-zero off-diagonal covariance elements are those relating kinematic
states in any one dimension).
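
A minimal sketch of this kind of augmented-vector nearest neighbor association follows. It is a simplification, not Mitzel's implementation: the covariance is assumed fully diagonal (the position-velocity cross terms noted above are dropped), the feature update is the scalar static-estimator form of the Kalman filter, and the function names and all numbers are hypothetical.

```python
def static_update(estimate, variance, measurement, noise_variance):
    """Kalman update for a constant state with no driving noise."""
    gain = variance / (variance + noise_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance

def weighted_distance_sq(track_vec, obs_vec, variances):
    """Squared distance weighted by an (assumed diagonal) covariance."""
    return sum((t - o) ** 2 / v
               for t, o, v in zip(track_vec, obs_vec, variances))

def nearest_neighbor(track_vec, observations, variances):
    """Index of the gated observation closest to the propagated track."""
    distances = [weighted_distance_sq(track_vec, obs, variances)
                 for obs in observations]
    return min(range(len(observations)), key=distances.__getitem__)

# Augmented vector: [x, vx, y, vy, feature_1, feature_2] (4 kinematic, m = 2)
track = [100.0, 10.0, 50.0, -5.0, 3.0, 0.7]
variances = [25.0, 4.0, 25.0, 4.0, 0.25, 0.01]
observations = [
    [102.0, 10.5, 51.0, -5.0, 4.5, 0.2],   # near in kinematics, far in features
    [106.0, 11.0, 47.0, -4.0, 3.1, 0.72],  # farther in kinematics, near in features
]
index = nearest_neighbor(track, observations, variances)
feature_estimate, feature_variance = static_update(
    3.0, 0.25, observations[index][4], 0.25)
```

With these numbers the second observation is selected, because its small feature-space distance outweighs its larger kinematic distance; the first feature estimate is then refined with the static update.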

The  implicit  assumption  of  independence  between  kinematics  and  feature  observables 
made  by  Mitzel  was  viewed  by  the  author  as  a  significant  limitation  to  the  performance 
of  this  algorithm.  The  original  motivation  for  the  start  of  this  research  was  a  desire  to 
improve  upon  the  assumptions  in  the  Mitzel  effort. 

2.3.4 Object Tracking / State Estimation - Conclusion. With the end of this section,
this chapter has covered nearly all of the previous or classical approaches to physical
object or tactical target recognition and tracking. From this point on, we will introduce
ideas  that  are  new,  or  at  least  unconventional,  with  respect  to  these  classical  approaches. 
The  eventual  intent  is  to  apply  these  new  ideas  to  dynamic  object  recognition,  in  particular 
for  moving  objects. 

2.4 Sequence Comparison by Dynamic Programming

This section introduces the mathematics of sequence comparison by dynamic programming
methods. A central focus of this research is the use of dynamic programming to
compare  (1)  measured  sequences  of  feature  observables  from  an  unclassified  object  against 
(2)  candidate  sequences  or  regions  from  potential  or  known  object  classes.  The  output 
of  this  comparison  process  will  be  a  set  of  likelihood  function  values,  or  measures  of  the 
likelihood  that  the  observed  sequence  was  generated  by  each  of  the  candidate  objects. 

Recalling the discussion on “independent look” pose estimates in Sect. 2.2.1, we intend
to show that dynamic programming sequence comparison methods offer a natural
method to restrict pose estimate histories to reasonable aspect angle progressions, in accordance
with other information about candidate object classes. In the following section,
we  will  see  why  this  restriction  improves  object  recognition. 
2.4.1 Dynamic Programming. Dynamic programming (DP), sometimes called the
“shortest path algorithm,” is a technique developed by Bellman [23] for making optimal
decisions  with  minimum  effort,  where  we  define  optimal  as  “minimum  cost,”  according  to 
some  specified  cost  criterion,  or  “minimum  expected  cost,”  where  stochastic  effects  result 
in  some  non-zero  probability  that  our  decisions  will  not  be  implemented  as  we  intend  [155, 
71,  67].  In  practical  use,  dynamic  programming  is  often  presented  as  a  tool  to  find  a 
minimum  cost  (length)  path  through  a  sequence  of  discrete  states  at  discrete  times,  as 
shown  in  Fig.  2.7,  where  each  “branch”  in  the  lattice  represents  a  state  transition  carrying 
a  given  cost,  and  the  total  cost  of  a  path  is  the  sum  of  the  cost  of  the  individual  steps. 

It is important to note here that the term “state” carries a distinctly different connotation
in dynamic programming than in general estimation and control practice. A “state”
in DP is generally a discrete location or point in some generally continuous, multidimensional
state space, rather than a particular dimension in that state space, as is the usual
meaning  in  estimation  and  control.  This  distinction  arises  because  DP  is  only  practical 
where  a  continuous  state  space  can  be  discretized  to  define  a  finite  (therefore  countable) 
number  of  locations. 

The  inelegant  (maximum  effort)  method  of  solving  the  problem  presented  in  Fig.  2.7 
would be by exhaustive search, i.e., to determine the cost of every possible path through
the lattice, and select the path of least cost. However, if our system conforms to Bellman’s
“Principle of Optimality,” then dynamic programming may allow us to determine
the minimum cost path without considering every possible path. For backward dynamic
programming, a classical tool in control applications where we wish to determine the
minimum-cost sequence of state transition decisions required to reach a given final point
from a given start point, the Principle of Optimality states that (from [71:3], with modification
to emphasize that this is a property that a system must be known or assumed to
have  in  order  to  use  dynamic  programming,  but  may  not  have  in  actuality): 

The  best  (minimum  cost)  path  from  A  to  D  must  have  the  property  that, 
whatever  the  initial  decision  at  A,  the  remaining  path  to  D,  starting  from  the 
next  point  B  after  A,  must  be  the  best  path  from  that  point  to  D. 
Figure 2.7. A Minimum Cost Path Defined By Dynamic Programming. (Each path step has an associated cost; dotted arrows mark the minimum cost path from A to D. Diagram taken from [71] with modification.)

If this is true, the second (and last) key idea of backward dynamic programming is
that  the  optimal  path/control  sequence  is  solved  by  working  backward  from  the  desired 
endpoint  to  the  beginning,  recording  the  minimum  cost  decision  for  each  state  as  we  go 
(this  recording  process  is  often  referred  to  as  the  setting  of  “pointers”).  At  the  beginning 
point,  we  then  retrace  our  steps  through  the  stages  (generally,  state/time  combinations) 
back  to  the  end  to  find  the  optimal  path. 
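
The backward recursion and pointer-setting just described can be sketched on a small lattice. The lattice, its node names, and its costs are invented for illustration (they are not those of Fig. 2.7), and the function names are hypothetical; the sketch also assumes every state at one stage may connect only to states at the next stage.

```python
def backward_dp(stages, cost, goal):
    """Backward dynamic programming over a staged lattice.

    stages: list of lists of node names, ordered from start to end
    cost:   dict mapping (node, successor) -> transition cost;
            a missing pair means the transition is not allowed
    Returns the cost-to-go table and the pointer (best successor) table.
    """
    cost_to_go = {goal: 0.0}
    pointer = {}
    for stage_index in range(len(stages) - 2, -1, -1):  # work backward
        for node in stages[stage_index]:
            best = None
            for nxt in stages[stage_index + 1]:
                if (node, nxt) in cost and nxt in cost_to_go:
                    total = cost[(node, nxt)] + cost_to_go[nxt]
                    if best is None or total < best:
                        best, pointer[node] = total, nxt  # set the pointer
            if best is not None:
                cost_to_go[node] = best
    return cost_to_go, pointer

def trace_path(start, goal, pointer):
    """Follow the recorded pointers from the start to the goal."""
    path = [start]
    while path[-1] != goal:
        path.append(pointer[path[-1]])
    return path

stages = [["A"], ["B1", "B2"], ["C1", "C2"], ["D"]]
cost = {
    ("A", "B1"): 1, ("A", "B2"): 2,
    ("B1", "C1"): 3, ("B1", "C2"): 2,
    ("B2", "C1"): 1, ("B2", "C2"): 3,
    ("C1", "D"): 1, ("C2", "D"): 4,
}
cost_to_go, pointer = backward_dp(stages, cost, goal="D")
path = trace_path("A", "D", pointer)
```

In this example the cheapest first step (A to B1) does not lie on the minimum cost path, which runs through B2 and C1 with total cost 4; the recursion finds it while examining each transition only once.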

A  straightforward  extension  of  backward  dynamic  programming  leads  to  forward 
dynamic  programming  [71:10-11],  the  form  commonly  used  in  sequence  comparison.  For 
forward dynamic programming, executed in the reverse direction with respect to the previous
paragraph, the Principle of Optimality states (with reference again to Fig. 2.7):

The  best  (minimum  cost)  path  from  A  to  D  must  have  the  property  that,  for 
any point C along that path before D, the best path from A to C must lie along
that  best  path  from  A  to  D. 

Using these rules, the challenge in using dynamic programming for any particular
application is to determine the incremental costs, transition rules (since not all paths or
transitions may be allowed), and (for stochastic problems) transition probabilities - i.e.,
the probability that a given transition will be executed, given that that transition or some
other one was chosen. Limiting the usefulness of dynamic programming, however, is the
“Curse of Dimensionality” [67:41, 75-76] - as the number of possible states and/or times
grows, the number of possible paths (dimensionality) grows explosively. Although dynamic
programming  requires  the  consideration  of  fewer  paths  than  exhaustive  search,  there  is  still 
a  price  to  be  paid,  and  that  price  can  become  prohibitive. 
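
A back-of-the-envelope count makes the trade-off concrete. The sketch assumes a fully connected lattice with the same number of states at every stage, and a state space quantized to the same number of levels in each dimension; the function names and the figures of 10 levels and 12 stages are arbitrary choices for illustration.

```python
def state_count(levels_per_dimension, dimensions):
    """Number of discrete DP states when each dimension is quantized."""
    return levels_per_dimension ** dimensions

def exhaustive_paths(states, stages):
    """Paths through a fully connected lattice: one state choice per stage."""
    return states ** stages

def dp_transitions(states, stages):
    """Transitions examined by DP: every state pair between adjacent stages."""
    return states * states * (stages - 1)

stages = 12
for dims in (1, 2, 4):
    n = state_count(10, dims)
    print(dims, n, exhaustive_paths(n, stages), dp_transitions(n, stages))
```

With one dimension, dynamic programming examines 1,100 transitions where exhaustive search would examine 10^12 paths; with four dimensions the dynamic programming count alone has already grown to about 1.1 billion.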

2.4.2 Dynamic Time Warping for Speech Recognition. The technique of dynamic
time  warping  (DTW)  was  born  from  the  observation  that  the  utterance  of  spoken  words  is 
a  stochastic  process  -  no  two  persons  pronounce  the  same  word  identically,  and,  moreover, 
the  same  person  does  not  pronounce  any  given  word  exactly  the  same  on  any  two  occasions, 
due  to  a  variety  of  factors  (level  of  stress,  word  context,  etc.).  It  was  further  reasoned  that 
a  considerable  portion  of  the  difference  between  any  two  utterances  or  realizations  of  a 
given  word  could  be  charged  to  stretching  (expansion)  or  compression  of  portions  of  one 
utterance  relative  to  the  other,  or  possibly  to  addition  or  deletion  of  relatively  minor 
sounds. If the effects of these differences could be eliminated, or “time warped,” so the
idea  went,  the  underlying  similarity  of  the  two  words  could  be  revealed. 

With  this  objective,  the  concept  of  using  dynamic  programming  for  time  warping  and, 
by  extension,  speech  recognition,  first  appeared  in  the  open  press  in  1970-71  in  articles  by 
Velichko  [214]  and  Sakoe  and  Chiba  [192].  Subsequent  research  in  this  field  was  conducted 
and  published  by  Itakura  [115],  White  and  Neeley  [223],  Sakoe  and  Chiba  [193],  and 
(leading  efforts  in  this  country)  Rabiner  et  al.  at  Bell  Laboratories  [182,  183,  168].  Texts 
by  Parsons  [176]  and  Sankoff  and  Kruskal  [195]  are  useful  references. 

Explanations of dynamic time warping and dynamic programming-based sequence
comparisons  in  general  require  the  use  of  a  diagram  like  Fig.  2.8.  Each  axis  of  the  diagram 
represents  a  sequence  of  twelve  elements  that  is  to  be  compared  to  the  other  sequence 
of  twelve  elements  -  the  circles  represent  possible  (but  perhaps  forbidden)  associations 
between  the  elements.  The  nature  of  the  elements  is  completely  general  -  often  they  are 
vectors  of  some  form,  but  all  that  we  will  require  is  that  some  distance  metric  exists  by 
which  one  element  can  be  compared  to  another. 

Often,  each  element  represents  a  discrete,  sampled  data  representation  from  some 
feature  space.  The  features  generally  represent  observable  quantities  due  to  a  physical  (i.e., 
classically  continuous)  process  or  trajectory  in  some  state  space,  where  the  true  location 
in  the  state  space  at  any  time  is  unknown  -  the  distinction  between  the  state  space  of  this 
trajectory  and  the  feature  space  of  the  observables  is  an  important  one,  and  not  always 
clear  in  the  literature  (see  discussion  in  Chapter  I  of  [195]).  Note  that,  in  general,  the  two 
sequences  need  not  contain  an  equal  number  of  points. 

The  “warping”  or  sequence  comparison  is  really  a  process  of  making  associations 
between  individual  elements  in  the  two  sequences,  computing  the  cost  of  each  association 
according  to  the  distance  (measure  of  dissimilarity)  between  the  element  in  one  sequence 
and  the  element  in  the  other,  and  finding  the  set  of  associations  or  matching  path  that 
gives  the  minimum  total  cost  or  distance.  Associations  are  made  subject  to  “continuity 
constraints,”  that  limit,  for  example,  the  number  of  associations  that  can  be  made  from 
one  element  of  one  sequence  to  elements  of  the  other  sequence,  the  number  of  elements 
in a sequence that can be skipped (if any), matching paths that move to the left or down
(generally, these paths are prohibited), and so on. Continuity constraints prevent undesirable
low cost associations between two sequences that really have significant differences,
and  they  lower  the  risk  of  incurring  the  Curse  of  Dimensionality  (see  previous  subsection). 

For  example,  in  Fig.  2.8,  the  dotted  lines  represent  global  continuity  constraints,  or 
bounds on the region of admissible paths - we have chosen not to allow or consider, for example,
association of element $a_{10}$ with element $b_2$. As an example of an undesirable low cost
association path, suppose that all of the elements in sequence $B = \{b_1, b_2, b_3, \ldots, b_{12}\}$ are
“closer” or more similar to element $a_1$ than to any other element in $A = \{a_1, a_2, a_3, \ldots, a_{12}\}$
in some chosen metric. This means that the (unconstrained) association path between A
and B will be a vertical line segment through the leftmost column of circles. But suppose
that the analyst is absolutely sure from some other source of information that $a_1$
cannot possibly correspond to, say, $b_3$ and above - continuity constraints can prevent this
“impossible” match and force the system to output a warping path cost which is more
representative  of  (if  not  explicitly  giving)  the  probability  that  sequences  A  and  B  arise 
from  the  same  underlying  space. 

Conversely,  the  global  constraints  in  Fig.  2.8  show  that  we  have  chosen  to  require  the 
association  of  the  first  and  last  elements  of  each  sequence  with  the  first  and  last  elements, 
respectively,  of  the  other.  The  local  continuity  constraints  for  this  diagram  are  shown  in 
the  extracted  portion  on  the  right  -  the  arrows  indicate  that  allowable  paths  into  any 
association  cell  can  only  come  from  the  left,  lower  left,  or  lower  cell.  This  simple  local 
constraint  is  the  most  common  of  many  alternatives  [195:136]. 

A  final  form  of  path  constraint  for  which  numerous  alternatives  have  been  proposed 
is  path  distance  weighting.  The  reader  will  observe  that  a  straight  (diagonal)  path  through 
the  array  in  Fig.  2.8  will  contain  fewer  elements,  and  therefore  a  lower  total  path  cost  if 
this  total  cost  is  a  simple  sum  of  individual  association  costs.  For  some  applications,  the 
designer may wish to relax this preference for diagonal paths, and may provide de-weighting
or “edge-weighting” techniques to make non-diagonal paths more feasible. For example,
considering the simple local constraint discussed above, one may choose to weight costs
from horizontal and vertical transitions by a multiplicative factor of 1/2, while costs from
diagonal  transitions  are  unweighted.  This  form  of  weighting  (discussed  in  [195])  tends  to 
make  warping  paths  that  move  horizontally,  then  vertically  (or  vice-versa)  just  as  feasible 
as  those  that  move  diagonally. 
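
The effect of such weighting can be checked with a small calculation. The sketch assumes a weight of 1/2 on horizontal and vertical steps and a weight of 1 on diagonal steps, a choice made here so that one horizontal step plus one vertical step costs the same as a single diagonal step when local costs are equal; the function name and the uniform local cost are hypothetical.

```python
def weighted_path_cost(path, local_cost, axis_weight=0.5, diagonal_weight=1.0):
    """Sum local costs along a path, weighting each step by its direction.

    path:       list of (j, i) cells; the first cell is charged at full weight
    local_cost: function (j, i) -> association cost for that cell
    """
    total = local_cost(*path[0])
    for (j0, i0), (j1, i1) in zip(path, path[1:]):
        diagonal = (j1 - j0 == 1) and (i1 - i0 == 1)
        weight = diagonal_weight if diagonal else axis_weight
        total += weight * local_cost(j1, i1)
    return total

def uniform(j, i):
    """Every association costs the same in this toy example."""
    return 1.0

diagonal_path = [(0, 0), (1, 1), (2, 2)]
staircase_path = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]

unweighted = (weighted_path_cost(diagonal_path, uniform, axis_weight=1.0),
              weighted_path_cost(staircase_path, uniform, axis_weight=1.0))
weighted = (weighted_path_cost(diagonal_path, uniform),
            weighted_path_cost(staircase_path, uniform))
```

Without weighting, the staircase path costs 5 against 3 for the diagonal path purely because it contains more cells; with the assumed weights both cost 3, so the comparison between them is decided by the local distances alone.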

In  Fig.  2.8,  then,  the  solid  line  represents  the  minimum  cost  sequence  of  associations 
between  the  elements  of  the  two  sequences  A  and  B,  subject  to  these  global  constraints, 
and  possibly  other  constraints  that  need  not  be  specified  explicitly  for  this  example.  This 
is  a  minimum  cost  path  through  the  space  of  associations,  subject  to  transition  constraints, 
and  forward  dynamic  programming  provides  a  natural  approach  to  determine  this  path. 
Note  that  backward  dynamic  programming  would  be  equally  applicable,  but  researchers 
in  this  field  prefer  to  start  associations  from  the  beginning  rather  than  from  the  end  - 
working  forward  allows  one  to  attempt  to  work  in  real  time,  and  word  beginnings  are  also 
generally  more  distinctly  spoken  and  identifiable  than  word  endings. 
Thus, for the simplest form of local continuity constraint, allowing diagonal, vertical,
or  horizontal  associations,  but  no  additions  or  deletions  (skipping  an  element  of  one  or  the 
other  sequence)  as  shown  in  Fig.  2.8,  the  forward  dynamic  programming  equations  can  be 
written  and  executed  from  left  to  right  as: 

\[
D(C_k) = d(c_k) + \min[D(C_{k-1})] \tag{2.29}
\]


where: 

$c_k = [a_j, b_i]$ is the $k$-th element in a sequence of allowable associations of elements
from sequence A with elements of sequence B, this particular association being between
element $a_j$ and element $b_i$

$C_k = \{c_1, c_2, c_3, \ldots, c_k\}$, the minimum cost sequence of associations leading to and
including association $c_k$

$d(c_k)$ = the cost or distance of association $c_k$, i.e., the distance in some metric between
element $a_j$ and element $b_i$

$D(C_k)$ = the total cost of reaching and accomplishing association $c_k$ by the minimum
cost sequence of allowable associations
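
As an illustration only, the recursion of Eqn. (2.29) can be sketched in a few lines of Python. The function name `dtw_cost`, the use of scalar sequence elements, the absolute-difference local cost, and the endpoint conditions (paths start at the first pair of elements and end at the last pair) are assumptions made for this sketch; no transition weighting is applied.

```python
import numpy as np

def dtw_cost(a, b):
    """Forward DP for Eqn. (2.29) with horizontal, vertical, and diagonal moves.

    D[j, i] holds the minimum total cost of any allowable path of
    associations that ends with the association [a_j, b_i].
    """
    n, m = len(a), len(b)
    D = np.full((n, m), np.inf)
    for j in range(n):
        for i in range(m):
            # Local cost d(c_k); absolute difference is an assumed metric.
            d = abs(a[j] - b[i])
            if j == 0 and i == 0:
                D[j, i] = d  # assumed start: first elements are associated
                continue
            # Cheapest allowable predecessor: vertical, horizontal, diagonal.
            best_prev = min(
                D[j - 1, i] if j > 0 else np.inf,
                D[j, i - 1] if i > 0 else np.inf,
                D[j - 1, i - 1] if j > 0 and i > 0 else np.inf,
            )
            D[j, i] = d + best_prev
    return float(D[n - 1, m - 1]), D

cost, table = dtw_cost([1, 2, 3], [1, 2, 2, 3])
print(cost)  # 0.0: the repeated 2 is absorbed by a horizontal move
```

Retaining the full table `D` is a choice made here so that the minimum cost path itself could be recovered by tracing predecessors backward from the final entry.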

It  is  important  to  point  out  that,  for  sequence  comparison  in  general  and  speech 
recognition  in  particular,  there  is  no  body  of  theory  prescribing  optimum  feature  space 
representations, distance metrics, or path constraints. These choices have been the subject
of much experimental research, without identification of one particular “best” approach
[176:297-303] [223:186-187] [168:634] [195:125-161], although various researchers
have  preferences  for  certain  approaches  in  particular  cases,  e.g.  [115]  [195:37-40]. 

The key point here is that, in contrast to dynamic time warping, for motion warping
(the application of dynamic programming sequence comparison in dynamic object
recognition) the dynamic state restrictions of each object class can determine natural
relationships that allow continuity constraints to be defined analytically. These continuity
constraints are exactly the restrictions on pose estimate transitions that were motivated
in Sect. 2.2.1.

2.4.3 General Sequence Comparisons. Apart from application in speech processing,
dynamic programming-based sequence comparisons have seen use (and independent
origins) in a wide variety of fields, as discussed in [195]. These fields include chromosome
comparison, dendrochronology (archaeological dating by tree ring analysis), bird song
comparison, recognition of partial images (by Gorman et al. [101], to be discussed in the
following chapter) and so on.

An  extensive  literature  search,  however  (see  bibliography),  has  failed  to  find  any  use 
of  such  classical  sequence  comparison  techniques  for  recognition  of  moving  (changing  aspect 
angle)  objects.  Telephone  conversations  with  Gorman  [100]  and  Rabiner  [180]  support  this 
observation.  In  Chapter  III,  it  will  be  shown  that  some  published  dynamic  programming- 
based  techniques  for  trajectory  estimation  [129,  66]  and  object  recognition  [136,  164,  165] 
can  be  viewed  as  sequence  comparison  techniques.  This  research  serves  in  part  to  fold 
these and other methods into the larger class of algorithms for sequence comparison by
dynamic programming. Application of the classical sequence comparison techniques discussed
above  and  other  aspects  of  this  research  will  considerably  widen  the  range  of  approaches 
for  dynamic  object  recognition  -  any  of  these  sequence  methods  can  provide  significant 
performance  improvements  over  the  conventional  “independent  look”  methods  discussed 
in  Sect.  2.2.1. 

In  this  research,  we  will  use  the  term  “warping  path  region”  or  “space”  when  referring 
to  the  finite  set  of  all  possible  associations  between  elements  of  two  finite  sequences  of 
feature  observable  vectors,  from  which  sets  of  associations  may  be  defined  under  applicable 
rules  to  define  warping  paths.  This  warping  path  space  is  illustrated  in  Fig.  2.8,  for  warping 
of two one-dimensional sequences (producing a “two-dimensional” warping path region or
space). 

In  a  key  generalization  to  basic  DTW  sequence  comparison,  we  may  conceive  of  the 
need to match a one-dimensional measured sequence with a two-dimensional origin region
of  possible  sequences  on  some  model.  Fig.  2.9  illustrates  the  three-dimensional  space  of 
allowable  matching  paths  that  results  from  this  matching  of  a  one-dimensional  sequence 
with  allowable  paths  through  a  two  (angular)  dimensional  aspect  angle  region,  as  opposed 
to  the  two-dimensional  space  of  paths  portrayed  in  Fig.  2.8. 

[Figure 2.9. Motion Warping in a Two-Dimensional Aspect Angle Region. Panel (a): Target
Model Sphere. Panel (b): Three-Dimensional Warping Path Diagram, annotated “Warping Path
Allows Off-Nominal Matches”.]

The one problem with this approach is that we are more likely to run afoul of the
“Curse of Dimensionality” [23, 71, 155] by allowing for a “three-dimensional” space of
paths. Otherwise, the extension to this technique, using somewhat more elaborate
three-dimensional continuity constraints, will be straightforward. The continuity constraints
for this approach, such as the width of the allowable path region and the lateral (as opposed
to along the nominal direction) path transition rules, can be defined similarly to those
discussed above.

Note  that  each  point  in  warping  path  space  is  really  an  association  between  two  points 
in  the  feature  observable  space.  These  feature  observable  values  may  arise  from  different 
points  in  the  underlying  state  space  (which  will  be  aspect  angle,  in  our  case),  but  the 
association  cost  is  based  only  on  the  difference  in  feature  observable  values.  The  continuity 
constraints  of  classical  sequence  comparison  methods  may  modify  these  association  costs 
to  deter  or  prevent  matches  that  appear  to  be  too  far  away  in  a  state  space  sense  (i.e., 
separated  by  too  many  sequence  elements).  However,  the  sequence  matching  cost  is  not 
explicitly  weighted  by  any  factor  arising  from  other  knowledge  that  may  be  available  about 
transitions  in  the  state  space  which  produces  the  measurements.  To  do  that,  we  will  apply 
the  following  dynamic  programming  state  estimation  algorithm  -  a  unique  form  of  dynamic 
programming  sequence  comparison,  but  one  not  heretofore  associated  with  the  family  of 
classical  DP  sequence  comparison  techniques. 

2.4.4 The Larson and Peschon (L&P) Algorithm. Larson and Peschon (L&P)
proposed a dynamic programming-based algorithm [133] for estimating the sequence of
n states or locations in some space with maximum a posteriori or MAP probability of
producing an observed sequence of n measurements, conditioned on a priori information
about the likelihood of transitions in the state space. They did not motivate their work
as a tool for object recognition working on an aspect angle space, but we will ultimately
apply it in this fashion. We will discuss this algorithm, and later illuminate the relationships
between it and the classical dynamic programming sequence comparison techniques
discussed above.

Given a sequence of measurements $Z_k = \{z_1, z_2, \ldots, z_k\}$, Larson and Peschon desired
to find the sequence of states $X_{k/k} = \{x_{0/k}, x_{1/k}, x_{2/k}, \ldots, x_{k/k}\}$ that maximized the
conditional probability density function:

$$p(X_{k/k} \mid Z_k) = \underset{X_k}{\mathrm{MAX}}\,[p(x_0, x_1, \ldots, x_k \mid z_1, \ldots, z_k)] = \underset{X_k}{\mathrm{MAX}}\,[p(X_k \mid Z_k)] \qquad (2.30)$$

where the term “MAX” refers to the operation of finding the maximum value of the indicated
term, over all values of $X_k$, representing the sequence of states $\{x_0, x_1, x_2, \ldots, x_k\}$.
The a priori information as to the likelihood of transitions in the state space, as discussed
above, is represented as $p(x_{k+1} \mid x_k)$ for any choice of $k$. Note, as do Larson and Peschon,
that the intent here is to estimate the entire sequence up to the present, rather than simply
the present state $x_k$.

Next, Larson and Peschon were willing to assume independence of measurements
$z_j$ from states $x_i$ and measurements $z_i$ for $t_j \neq t_i$, implying, for example, that the time
interval  required  to  take  data  for  one  measurement  is  less  than,  and  synchronized  with, 
the  loiter  time  in  any  one  state,  and  that  the  measurement  instrument  is  independent 
from  event  to  event.  These  assumptions  are  typical  of  discretization  assumptions  made  to 
implement  dynamic  programming  algorithms  in  naturally  continuous  spaces,  or  to  limit  the 
dimensionality  of  a  problem  in  naturally  discrete  spaces  (where  a  “coarser”  discretization 
than  the  natural  level  may  be  assumed  to  reduce  computation). 
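
As a hedged restatement (our reading of the assumptions, not a formula quoted from Larson and Peschon), the independence assumption, together with the first-order transition model $p(x_{k+1} \mid x_k)$ introduced above, amounts to:

$$p(z_k \mid X_k, Z_{k-1}) = p(z_k \mid x_k), \qquad p(x_k \mid X_{k-1}, Z_{k-1}) = p(x_k \mid x_{k-1})$$

so that the joint density of a state sequence and a measurement sequence factors stage by stage:

$$p(X_k, Z_k) = p(x_0) \prod_{i=1}^{k} p(z_i \mid x_i)\, p(x_i \mid x_{i-1})$$

It is this product structure, in which each factor involves only adjacent states and the current measurement, that permits a staged maximization.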

These assumptions appear to be reasonable for object recognition algorithms in which
measurements arise from an aspect angle state space and sensors have high “bandwidth”
relative to the observable state transition processes. In other words, we expect the
assumptions to be appropriate where the sensors have shorter response times than the time
intervals over which the underlying state space is expected to change enough to alter the
“mean” signature - effects on signatures due to state changes over the time required to
make a measurement will be indistinguishable from the effects due to noise if there were
no state change. In Chapter III, we will discuss a particular case of interest to us in which
these conditions may not apply, and an approach by which the Larson and Peschon
assumptions might be relaxed to address this issue. It will be clear that the assumptions
made by Larson and Peschon above can be relaxed if required without destroying the utility
of their basic approach, but at the cost of increased dimensionality, in that probabilities or
likelihoods (costs, in the general dynamic programming sense) become path and/or
measurement dependent - this is what Larson and Peschon are trying to avoid through their
assumptions.

The extent to which discretization assumptions are reasonable or restrictive in any
particular scenario, however, must be evaluated on a case-by-case basis. This evaluation
can be conducted from first principles - assessing, for example, as we did in the previous
paragraph, whether or not in any particular state transition / measurement scenario,
assumptions like those of Larson and Peschon are reasonable. Alternatively, in test and
ultimately in practice, the evaluation will be done empirically - obtaining a problem solution
(a control law, a target recognition algorithm, etc.) under the discretization assumptions,
and assessing whether or not that solution provides acceptable performance when applied
to states and/or measurements from a continuous or more finely discretized “truth”
scenario. In general, as in sampling problems, the assumption of discretization becomes more
physically reasonable (if not more feasible computationally) as the discretization fineness
increases. Where an overall fine level of discretization leads to high dimensionality and
the “curse of dimensionality” becomes an issue, it may be reasonable to limit high
dimensionality or fine discretization to particular subsets of the space where the optimal answer
is expected to lie, perhaps using iterative solution processes to converge to an adequate
answer [155:239, 247, 256-257].

In any case, their assumptions allow Larson and Peschon to use Bayes’ Rule and
break the maximization process into stages, making it suitable for solution by dynamic
programming with the following equations (use of which will be discussed below):

$$p(X_{k/k} \mid Z_k) = \underset{x_k}{\mathrm{MAX}}\;\underset{X_{k-1}}{\mathrm{MAX}}\,[p(X_k \mid Z_k)] = \underset{x_k}{\mathrm{MAX}}\left\{\underset{X_{k-1}}{\mathrm{MAX}}\,[p(X_k \mid Z_k)]\right\} = \underset{x_k}{\mathrm{MAX}}\; I(x_k, k) \qquad (2.31)$$

which shows the final step in the process, a maximization of $I(x_k, k)$ over all possible final
states $x_k$, where:

$$p(X_k \mid Z_k) = \frac{p(X_k, Z_k)}{p(Z_k)} = \frac{p(z_k \mid x_k)\, p(x_k \mid x_{k-1})}{p(z_k \mid Z_{k-1})} \cdot \frac{p(X_{k-1}, Z_{k-1})}{p(Z_{k-1})} \qquad (2.32)$$

and:

$$I(x_k, k) = \underset{X_{k-1}}{\mathrm{MAX}}\,[p(X_k \mid Z_k)] \qquad (2.33)$$

Then, stepping theoretically to a hypothetical $(k+1)$-st step:

$$I(x_{k+1}, k+1) = \underset{X_k}{\mathrm{MAX}}\left[\frac{p(z_{k+1} \mid x_{k+1})\, p(x_{k+1} \mid x_k)}{p(z_{k+1} \mid Z_k)}\; p(X_k \mid Z_k)\right] \qquad (2.34)$$

or, equivalently, in the recursive form which is the heart of the algorithm:

$$I(x_{k+1}, k+1) = \underset{x_k}{\mathrm{MAX}}\left[\frac{p(z_{k+1} \mid x_{k+1})\, p(x_{k+1} \mid x_k)}{p(z_{k+1} \mid Z_k)}\; I(x_k, k)\right] \qquad (2.35)$$

Since the factor $p(z_{k+1} \mid Z_k)$ is the same for all maximizations made at any time $t_k$,
the actual maximization at any stage need not be done over the term shown in brackets in
Eqn. (2.35), but rather only over the expression defined by computing this term without
its denominator, denoted $I^*(x_{k+1}, k+1)$.
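
Read this way (a restatement offered as an aid, not an equation reproduced from [133]), the quantity actually propagated from stage to stage would be:

$$I^*(x_{k+1}, k+1) = \underset{x_k}{\mathrm{MAX}}\,\left[p(z_{k+1} \mid x_{k+1})\, p(x_{k+1} \mid x_k)\, I^*(x_k, k)\right], \qquad I^*(x_0, 0) = p(x_0)$$

Under the stated assumptions, $I^*(x_k, k) = p(Z_k)\, I(x_k, k)$. Because $p(Z_k)$ is positive and does not depend on the state sequence, the maximizing states are the same for $I^*$ as for $I$.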

The  above  equations  are  used  in  a  recursive  forward  dynamic  programming  procedure 
which  works  as  follows  (from  [133]  with  elaboration): 


(1) Quantize the state space $[x, t]$ to obtain a grid consistent with the accuracy requirements
of the problem. The italicized words are quoted directly from Larson and Peschon, and
speak to the potential pitfalls inherent in the discretization process. This implied warning
is common to all implementations of dynamic programming for naturally continuous
processes, as noted above.

(2) Initialize the (forward DP) iterative procedure by defining $I^*(x_0, 0) = p(x_0)$, the a
priori probability density for each possible discrete $x$ at time $t_0$.

(3) For each quantized state $x_1$ (i.e., each possible discrete $x$ at time $t_1$), calculate $I^*(x_1, 1)$
from $z_1$ and Eqn. (2.35), with appropriate subscript changes for stage 1, rather than $k+1$.

(4) Write $x_0(x_1, 1)$ as the value of $x_0$ for which Eqn. (2.35) is maximized in the previous
calculation (establishing “pointers” which will be retraced to find the optimum state
sequence).

(5) Repeat steps (3) and (4) at each sampling instant until the $k$-th instant is reached. Each
such iteration is one stage. This is the iterative forward dynamic programming procedure,
moving forward through successive stages to completion.

(6) Determine the modal trajectory $X_{k/k}$ by first using Eqn. (2.31) to find $x_{k/k}$ (i.e., the
state with highest probability of being the terminus of the true state sequence) and then
iteratively retracing the pointers set up in step(s) (4), to find the optimal state sequence,
i.e., $x_{i/k} = x_i(x_{i+1/k}, i+1)$.

Thus, the Larson and Peschon equations define a forward dynamic programming
algorithm for determining the one sequence of states $X_{k/k}$ that gives us the maximum
likelihood of generating a set of measurements $Z_k$, given some additional or a priori
information $p(x_0)$ and $p(x_{k+1} \mid x_k)$ about the likelihood of starting states and transitions in
the state space.
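
The six-step procedure above can be sketched as follows for a state space that has already been quantized into S discrete states. This is a minimal illustration under stated assumptions, not the implementation of [133]: the function name `lp_map_sequence`, the array layout, the time-invariant transition table, and the use of logarithms (to avoid numerical underflow over many stages) are all choices made for this sketch.

```python
import numpy as np

def lp_map_sequence(prior, trans, like):
    """Forward DP with pointer retracing, following steps (2) through (6).

    prior : (S,)   a priori density p(x_0) over the quantized states
    trans : (S, S) trans[i, j] = p(x_{k+1} = j | x_k = i), assumed time-invariant
    like  : (K, S) like[k-1, j] = p(z_k | x_k = j) for the observed z_1..z_K
    Returns the K+1 state indices of the modal trajectory x_{0/K}..x_{K/K}.
    """
    prior = np.asarray(prior, dtype=float)
    trans = np.asarray(trans, dtype=float)
    like = np.asarray(like, dtype=float)
    K, S = like.shape
    with np.errstate(divide="ignore"):       # log(0) = -inf is acceptable here
        log_I = np.log(prior)                # step (2): I*(x_0, 0) = p(x_0)
        log_T = np.log(trans)
        log_L = np.log(like)
    pointers = np.zeros((K, S), dtype=int)
    for k in range(K):                       # step (5): one stage per measurement
        cand = log_I[:, None] + log_T        # cand[i, j]: arrive at j from i
        pointers[k] = np.argmax(cand, axis=0)            # step (4): best predecessor
        log_I = cand[pointers[k], np.arange(S)] + log_L[k]  # step (3), Eqn. (2.35)
    path = [int(np.argmax(log_I))]           # step (6): most likely terminal state
    for k in range(K - 1, -1, -1):           # retrace the pointers
        path.append(int(pointers[k, path[-1]]))
    return path[::-1]

trans = [[0.9, 0.1], [0.1, 0.9]]
like = [[0.8, 0.2], [0.9, 0.1], [0.2, 0.8]]
print(lp_map_sequence([0.5, 0.5], trans, like))  # [0, 0, 0, 0]
```

In this small example the last measurement favors state 1, but the a priori transition information makes a late jump less likely than a noisy measurement, so the modal trajectory stays in state 0. Strengthening the last measurement to `[0.05, 0.95]` changes the result to `[0, 0, 0, 1]`.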

2.4.5  Relating  Dynamic  Time  Warping  (DTW)  and  the  L&P  Algorithm.  Both 
DTW  and  the  L&P  algorithm  are  DP  sequence  comparison  techniques.  The  fundamental 
difference  between  them  is  that  DTW  does  not  consider  state  (as  opposed  to  feature) 
transitions that occur off a single “one-dimensional” path in state space. In the usual
DTW  case,  we  have  little  knowledge  of  the  underlying  state  space  -  only  examples  of 
the  feature  sequences  produced  by  typical  state  trajectories.  Observations  from  one  state 
trajectory  are  simply  compared  to  observations  from  another  trajectory,  and  “warped” 
to  allow  for  an  optimal  match.  DTW  generally  attempts  to  associate  an  element  of  one 
sequence  with  more  than  one  element  of  the  other  sequence,  leading  toward  a  bias  for 
solutions  that  minimize  the  total  number  of  associations. 

On the other hand, the L&P algorithm can use information known a priori, or
aside  from  the  feature  observations,  about  the  likelihood  of  transitions  in  the  state  space. 
This  allows  the  L&P  algorithm  to  “investigate”  more  than  one  state  trajectory.  The  L&P 
algorithm  does  not  attempt  to  match  more  than  one  state  space  point  with  a  given  element 
in  the  feature  sequence,  and  thus  has  no  “arithmetic”  bias  toward  short  paths  in  the  state 
space. 

The  drawback  to  the  L&P  algorithm  is  its  “maximum  likelihood”  (ML)  nature,  in 
the  sense  that,  given  a  set  of  m  observations,  it  finds  the  set  of  m  discrete  states  most  likely 
to have generated the observations, subject to a priori constraints $p(x_0)$ and $p(x_{k+1} \mid x_k)$.
It may be, however, that a state space region exists which has a higher overall probability
of producing the given observations, when all possible trajectories over time through that
region  are  considered.  By  comparison  with  a  region  chosen  by  the  L&P  algorithm,  this 
“better”  region  might  have  many  points  which  are  rather  likely  to  have  generated  the 
given  observations,  while  the  “L&P”  region  has  a  few  well-positioned  points  which  are 
very  likely  origins,  but  many  that  are  quite  unlikely.  The  use  of  DTW  in  such  a  case, 
forcing  each  point  along  a  likely  state  trajectory  to  associate  with  an  observation,  could 
select  the  “better”  region  instead  of  that  selected  by  the  L&P  algorithm. 

Unfortunately, while the L&P algorithm can use the (relative) computational economy
of DP to find the ML sequence of states in a state space of arbitrary dimension,
the state space region with highest probability of generating the observed features can in
general be found only by exhaustive search. A set of nominal or a priori likely trajectories
through the state space would provide a starting point for such a search with DTW
methods. Clearly, a reasonable choice for the most likely trajectory based on a priori
information is given by propagating the starting state and transition information used in
the Larson and Peschon approach through the state space, without considering feature
observations at each step. In other words, we start with $\arg_{x_0}[\max_{x_0} p(x_0)]$ and
subsequently pick $\arg_{x_{k+1}}[\max_{x_{k+1}} p(x_{k+1} \mid x_k)]$ over all $t_k$ of interest. Chapters III
and V will show the application of these concepts in object recognition applications.
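
A minimal sketch of this propagation, assuming a quantized state space and a time-invariant transition table (the function name `nominal_trajectory` and the array layout are hypothetical), might read:

```python
import numpy as np

def nominal_trajectory(prior, trans, n_steps):
    """Propagate the a priori most likely trajectory, ignoring measurements.

    prior : (S,)   p(x_0) over quantized states
    trans : (S, S) trans[i, j] = p(x_{k+1} = j | x_k = i)
    """
    trans = np.asarray(trans, dtype=float)
    state = int(np.argmax(prior))            # most likely starting state
    path = [state]
    for _ in range(n_steps):
        state = int(np.argmax(trans[state]))  # most likely successor of current state
        path.append(state)
    return path

prior = [0.2, 0.7, 0.1]
trans = [[0.6, 0.3, 0.1],
         [0.1, 0.3, 0.6],
         [0.2, 0.2, 0.6]]
print(nominal_trajectory(prior, trans, 3))  # [1, 2, 2, 2]
```

Note that this stepwise choice is greedy: it follows the most likely transition at each step, which need not coincide with the sequence of highest overall a priori probability.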

Like  all  forms  of  classical  sequence  comparison,  the  L&P  algorithm  and  DTW  are 
fundamentally  syntactic  identification  techniques.  Implicitly,  comparing  two  sequences 
(or, as we will see in the following chapter, comparing a sequence with a space of possible
sequences)  using  these  techniques  is  a  syntactic  comparison  process.  We  may  not  be  able 
formally  to  identify  the  root  symbol,  production  rules  and  other  syntactic  descriptors,  but 
comparing  the  sequence  terminals  in  proper  order  implies  a  comparison  in  some  sense  of 
the  syntactic  processes  which  generated  those  sequences.  Interestingly,  in  the  case  of  the 
L&P algorithm, we may note that the state transition likelihood quantity $p(x_{k+1} \mid x_k)$
and the measurement likelihood $p(z_{k+1} \mid x_{k+1})$ for all states of interest are explicitly a
production  rule  in  the  syntactic  sense  -  for  any  one  object  class  or  grammar,  they  tell  us 
how  a  sequence  of  states  becomes  a  sequence  of  measurements  [90:54]  [212:318].  In  cases  of 
interest  to  us,  the  syntactic  root  symbols  will  be  dynamic  object  classes  exhibiting  specific 
state  transitions. 

2.5  Dynamic  Programming  in  Object  Recognition 

The purpose of this section is to illuminate previous efforts in object or target recognition
and tracking that employed dynamic programming. Here and in the following chapters,
these  approaches  will  be  contrasted  with  the  author’s  research. 

2.5.1 Conception and Development of Motion Warping. The fundamental inspiration
for the research described in this dissertation was the observation that dynamic
programming-based sequence comparison techniques could be applied meaningfully to object
recognition using sequences of sensor signatures from a turning object, where (1) those
signatures are a function of the angular orientation of the object, and (2) a close relationship
exists between the object’s translational and angular orientation states. The proposed
concept is referred to as “motion warping”. In summary, the key to this formulation is
that we:

(1)  Start  with  a  potential  object’s  underlying  “feature  observable  surface,”  nominally  a 
hypothetical  aspect  angle  sphere  coordinatized  in  some  two-dimensional  (azimuth  and 
elevation)  spherical  coordinate  representation  (but,  like  the  earth’s  surface,  suitable  for 
analysis  as  a  planar  section  over  local  regions),  with  each  (angular)  dimension  coordinatized 
linearly. 

(2) “Pre-warp” a one- or two-dimensional subset of that surface to make it appear as it
would to a sensor scanning it at locations and rates estimated from kinematics and object
dynamic restrictions.

(3) Compare the observed measurement sequence with the pre-warped surface subset,
treating the comparison as one of finding the most likely continuous path for the observed
sequence along the pre-warped surface, allowing for additional warping by dynamic
programming-based sequence comparison techniques (e.g., dynamic time warping, not to
be confused with the pre-warping process).

(4) Select, as the most likely object class corresponding to the observed measurements,
the potential object class offering the closest match between the observed sequence and its
pre-warped surface subset.
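
The four steps can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption made for illustration: the feature observable surface is reduced to a scalar signature tabulated over a single aspect angle, pre-warping is done by linear interpolation at aspect angles assumed to have been predicted from kinematics, the comparison uses an unweighted DTW with an absolute-difference cost, and the names `prewarp`, `dtw_cost`, and `classify` are hypothetical.

```python
import numpy as np

def prewarp(model_angles, model_values, predicted_angles):
    """Step (2): sample the model signature at the kinematically predicted angles."""
    return np.interp(predicted_angles, model_angles, model_values)

def dtw_cost(a, b):
    """Step (3): unweighted DTW cost with horizontal, vertical, diagonal moves."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0                             # padded border simplifies the recursion
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            d = abs(a[j - 1] - b[i - 1])      # assumed local distance
            D[j, i] = d + min(D[j - 1, i], D[j, i - 1], D[j - 1, i - 1])
    return float(D[n, m])

def classify(observed, predicted_angles, models):
    """Step (4): choose the class whose pre-warped template matches best."""
    costs = {
        name: dtw_cost(observed, prewarp(angles, values, predicted_angles))
        for name, (angles, values) in models.items()
    }
    return min(costs, key=costs.get), costs

# Step (1): toy signature tables over aspect angle in degrees (invented values).
models = {
    "class_A": ([0.0, 90.0, 180.0], [0.0, 1.0, 0.0]),
    "class_B": ([0.0, 90.0, 180.0], [1.0, 0.0, 1.0]),
}
predicted = [0.0, 45.0, 90.0, 135.0, 180.0]
observed = [0.0, 0.5, 1.0, 0.5, 0.0]
label, costs = classify(observed, predicted, models)
print(label)  # class_A
```

In this toy case the observed sequence follows the class_A signature along the predicted aspect angles, so that class yields the smaller warping cost. A two-dimensional aspect angle region would require a correspondingly larger path space.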

With  this  basic  plan  in  mind,  the  author  reviewed  relevant  sources  on  multisensor 
fusion,  object  tracking,  and  pattern  recognition  to  determine  if  the  proposed  concept  had 
already  been  investigated.  Eventually,  this  research  included  all  of  the  classified  resources 
available  from  the  Target  Recognition  Technology  Branch  (WL/AARA  [166])  of  Wright 
Laboratory at Wright-Patterson Air Force Base, Ohio.

The  first  three  of  the  sources  discussed  below  (Kenyon,  Gorman,  and  Amini  et  al.) 
were among the first to be found and reviewed, and were readily eliminated from consideration
as competing research. The subsequent three sources (Larson and Peschon, Barniv,
and  Kramer)  served  as  inspiration  for  recent  efforts  by  the  author  of  the  final  reference 
(Mieras  et  al.),  whose  methods  approach  but  do  not  encompass  the  research  described  in 
this  dissertation. 

Between  Kramer  and  Mieras,  however,  we  consider  a  much  older  (1978)  reference 
by  Le  Chevalier  et  al.  of  France  whose  concepts  are  very  close  to  those  of  Mieras  et  al. 
and similarly approached (particularly in philosophy), but did not encompass, the author’s
concepts.  These  two  final  authors,  Le  Chevalier  and  Mieras,  are  the  only  sources  found 
whose  concepts  resemble  motion  warping  to  any  meaningful  degree,  so  their  efforts  are  of 
particular  interest. 

In the next chapter, we will see that the efforts of Le Chevalier and Mieras can be
posed  as  suboptimal  versions  of  the  methodology  pursued  in  this  research.  This  research 
serves  to  generalize  and  significantly  extend  their  groundbreaking  efforts. 

2.5.2 Previous Developments by S. Kenyon. In a 1988 article [122] in the Proceedings
of the 1st National Symposium on Sensor Fusion [42], Kenyon proposed to use
dynamic programming with object kinematics and attributes in an objective function to
match observations with track files. Thus, the intended use of the proposed system was the
same as for the systems proposed by Bar-Shalom, Blackman, and Mitzel in Sect. 2.3.3 (see
references there). Since the Kenyon article was not specific as to how dynamic programming
was employed, Kenyon was contacted by telephone [123] and was asked, after a short
explanation of the proposed research, if his article referred to a concept like motion warping.
He replied that it did not.

2.5.3 Previous Developments by A. Amini, H. Yamada, et al. In the course
of researching previous efforts to apply dynamic programming in pattern recognition, a
number of articles were found regarding efforts to match stationary images of objects to
a priori-defined images or image maps. Representative of these are [70, 3, 230]. These
techniques are primarily point-to-point correspondence techniques as discussed in App. B,
with dynamic programming used to establish a minimum-cost association between points.
These efforts have no similarity to the author’s research.

2.5.4 Previous Developments by B. Burg, J. Gorman, et al. A 1986 article by
Burg and Zavidovique [41] of France proposed the use of dynamic programming-based
sequence comparison mathematics of the type proposed for this research to recognize
stationary images. A 1988 article by Gorman [101] proposed the similar use of dynamic
programming-based sequence comparison techniques to recognize partially occluded
stationary objects. Basically, Gorman’s research called for breaking up an observed candidate
object silhouette (possibly partially occluded) into a set of sections, for each of which a
Fourier descriptor representation (analogous to a word) was defined. These section
representations were then compared to the sections for full silhouettes from known object
classes and orientations. The objective was to find that full silhouette which best matched
the observed silhouette over most of its perimeter, with non-matching portions ascribed to
partial occlusion. Due to the similarity of this effort with the proposed effort in terms of
application and mathematical tool, Gorman was contacted by telephone [100] and asked
if his research had included anything like motion warping. He stated that it had not -
in particular, he had not considered the issue of changing aspect angle. So far as can
be determined, the work of Burg and Zavidovique likewise has no impact on the author’s
research.

2.5.5 Previous Developments by Y. Barniv et al. The developments by Y. Barniv,
as originally published in [12, 13] for low signal-to-noise ratio object tracking using
dynamic programming, are a direct application and development of the Larson and Peschon
equations (Sect. 2.4.4). Together with later ideas advanced by Weiss and Friedlander [222],
Barniv’s concepts appear in their latest form in [8:85-154] (note: the reader is advised to
read Larson and Peschon [133] before reading the Barniv works). The apparent intended
application of Barniv’s concepts, or at least an application which serves to illustrate the
concept, was as a signal processor for a space-based imaging infrared sensor attempting
to detect cruise missiles over the ocean. The Barniv procedure is one of a class of what
are called “track before detect” algorithms, i.e., algorithms that use some rules to string
together sequences of states or points in space as though they constitute tracks, and then
attempt to determine if an object was actually passing through any of the sequences of
points at the observed times.

Barniv’s contribution was to define the terms in the Larson and Peschon Eqns. (2.30)
through (2.35) appropriately for his chosen problem. In the Barniv development, one tracks
motion on a 32 × 32 array of imaging sensor pixels. Each pixel is divided into 4 quadrants,
called cells. Thus the image plane has 64 × 64 cells. A “state” is a pairing of any two
cells, a start cell and an end cell, located 4 pixel or 8 cell lengths apart. This distance
corresponds to the travel of a nominal object over the pixel integration time. For any given
cell, there are 164 states corresponding to trajectories in all possible directions. Thus, on
the image plane, there are 64 × 64 × 164 = 671,744 total states. A stage in the dynamic
programming sense is defined by all possible states over the pixel integration time.
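
The state count quoted above can be checked with a few lines of arithmetic. The variable names are ours, and the figure of 164 states per cell is taken from the text rather than derived here.

```python
pixels_per_side = 32
cells_per_pixel_side = 2      # each pixel is split into 4 quadrants (2 x 2)
states_per_cell = 164         # quoted in the text, not derived in this sketch

cells_per_side = pixels_per_side * cells_per_pixel_side
step_in_cells = 4 * cells_per_pixel_side          # 4 pixel lengths, in cells
total_states = cells_per_side ** 2 * states_per_cell

print(cells_per_side, step_in_cells, total_states)  # 64 8 671744
```

A state space of this size at every stage gives a sense of why the discretization level matters so much to the feasibility of the procedure.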

The discrete probability density $p(x_{k+1} \mid x_k)$ is defined through any one of several
approaches as a monotonically decreasing function of the absolute value of the curvature
of the trajectory - that is, reflecting the fact that cruise missiles are most likely to travel
in straight lines. Since a given state or straight line trajectory segment passes over given
rectangular cells with generally different dwell times in each, this factor is taken into
account when defining $p(z_k \mid x_k)$ - the probability or likelihood of obtaining the observed
measurement from a set of cells, given that the trajectory passed through that set of
cells. Basically, then, the Larson and Peschon procedure is repeated, with each new set
of integrated image frames, starting with every cell/direction combination on the image
plane as a possible $x_0$, looking for generally straight tracks over a growing number of image
frames, and identifying the latest endpoints of potential tracks from states (effectively,
terminal cells with some inbound direction) that have a value of $I^*(x_k, k)$ greater than
some threshold.

2.5.6 Previous Developments by Kramer and Demirbaş. The concepts proposed
by Barniv were subsequently adapted by Kramer and Reid [129] for “track before detect”
processing with doppler radar. In the Kramer development, the states are “range-azimuth-doppler”
cells, i.e., discretized range and azimuth cells that are further subdivided according
to  doppler  velocity  bins  (elevation  is  not  considered  explicitly,  so  the  physical  tracking  space 
is  apparently  two-dimensional).  Thus,  Kramer  starts  with  each  range-azimuth-doppler  cell, 
performing  a  forward  dynamic  programming  procedure  after  each  scan  to  determine,  for 
each  cell  with  some  minimum  acceptable  return  level,  what  the  most  likely  predecessor 
cell was. The analog of Larson and Peschon’s $I^*(x_k, k)$ parameter here is Kramer’s track
“quality count,” which is not a probability, but which serves to identify likely object tracks.

Demirbaş [66] has (apparently independently) developed a similar concept for radar
tracking.  The  author  has  not  analyzed  his  concepts  in  detail. 
2.5.7 Previous Developments by F. Le Chevalier et al. Several months after
the  definition  and  development  of  the  motion  warping  concept,  which  included  extensive 
research  covering  the  time  period  from  1980  on,  and  which  indicated  no  similar  efforts  in 
radar  or  imaging  sensor  object  recognition  or  computer  vision  [14,  10,  8,  28,  33,  40,  55,  94, 
96, 123, 166, 169, 180, 191, 218, 224], the author conducted additional research into the
pre-1980 time frame on a related concept. While reviewing the Proceedings of the 1978 IEEE
Conference  on  Pattern  Recognition,  the  author  found  an  article  by  Francois  Le  Chevalier 
of  O.N.E.R.A.,  Paris,  France  [136]  which  included  several  of  the  same  observations  and 
ideas  as  those  behind  the  author’s  concept  of  motion  warping.  Specifically,  the  key  points 
of  the  Le  Chevalier  article  were  that: 

(1)  Aspect  angle  information  for  aircraft  targets  would  be  available  from  a  radar  tracker  and 
could  be  used  to  fuse  radar  cross  section  (RCS)  measurements  for  improved  identification. 

(2)  A  “shortest  path”  (presumably,  Le  Chevalier  meant  forward  dynamic  programming) 
process  linking  associations  between  (a)  model  outputs  at  various  aspect  angles  and  (b) 
measurements  from  a  target  of  unknown  type  could  be  used  to  determine  the  most  likely 
target  class  to  have  produced  the  observed  sequence  of  outputs. 

(3) The transition rules in the dynamic programming process would be driven by “evolutionary
constraints” associated with the target dynamics as observed by the radar -
i.e.,  model  aspect  angle  state  transitions  which  did  not  conform  to  feasible  or  observed 
target  dynamics  would  not  be  allowed.  These  constraints  are  evidently  invoked  as  limits, 
or  “yes/no”  decisions,  and,  as  will  be  seen  in  the  following  chapter,  are  therefore  invoked 
more  crudely  than  the  approach  envisioned  in  the  author’s  research. 

(4)  The  proposed  method  could  be  employed  with  sensor  systems  other  than  radar  -  indeed, 
for  any  feature  observable  for  which  a  metric  between  measurements  and  predictions  can 
be  established. 

This  proposal  by  Le  Chevalier  et  al.  was  derived  not  from  an  understanding  of 
speech  processing  and  multisensor  fusion  as  was  the  author’s,  but  from  an  appreciation 
that  a  radar  target  could  be  represented  as  a  finite  state  automaton  producing  sequences  of 
observables  over  time/aspect  angle,  in  accordance  with  evolutionary  constraints  described 
by the target’s dynamic limitations. Undeniably, the motivation and goal of Le Chevalier
et  al.  were  the  same,  and  the  methods  are  related,  as  we  will  see  below.  The  research 
described  herein,  however,  was  developed  independently  of  Le  Chevalier’s  work. 

Reviewing  the  discussion  in  [136],  the  Le  Chevalier  technique  is  believed  to  work 
essentially in the following fashion. First of all, from some (presumably kinematic) “a
priori” aspect angle information, a given (first) radar signature is known to originate from
some relatively large aspect angle window on the unknown target, say 10 to 20 degrees
square. The aspect angles in this window are defined for discrete values of length and
width (linear aspect angle). Then, the feature space metric matching distances between
the observed signature and the model-defined signatures are defined for each discrete aspect
angle value in the window, for each model.

Due to the Chi-square metric used by Le Chevalier et al., this matching distance, for
any given observation-to-model/aspect angle comparison, is directly related to the maximum
classical likelihood (probability density value) $p(z_k^f \mid \omega_i, x_k)$ that feature observation
$z_k^f$ was generated by model $\omega_i$ at discrete aspect angle state $x_k$.

Now,  we  consider  the  second  observed  signature,  also  defined  to  have  come  from  some 
nominal  aspect  angle  window  on  the  target,  possibly  a  somewhat  different  window  than 
on  the  last  signature.  Again,  for  each  model  and  window  (at  suitably  discretized  aspect 
angles)  we  define  the  Chi-square  (matching)  distances  to  this  second  observed  signature. 
Next, for each model, around each discrete aspect angle value (state) with a matching distance
less than some threshold for the second signature, we examine the matching distance
values for the first signature from states that lie within some “evolutionary constraint”
aspect angle bound or “association gate” (evidently circular) around that second signature/aspect
angle match. We associate each second signature/aspect angle match with the
most  likely  (minimum  matching  distance)  first  signature/aspect  angle  match  lying  within 
the  acceptable  radius.  The  total  matching  distance  associated  with  the  second  state  is 
now  the  sum  of  the  second  and  (minimum)  first  state  distances.  This  process  continues 
over  successive  signatures  for  each  model  until  the  sequence  with  minimum  total  matching 
distance  to  the  final  window  (signature)  is  chosen,  indicating  the  most  likely  model  to  have 
produced  the  observed  measurement  sequence,  and  a  corresponding  aspect  angle  history. 
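
The procedure as understood above can be sketched as a gated shortest-path recursion. This is only an interpretation of the description in this section, not Le Chevalier's implementation: the function names, the Euclidean gate in (azimuth, elevation) degrees with no angle wraparound, and the treatment of the threshold are all assumptions made for illustration.

```python
import math


def gated_dp_distance(dist_seq, states, gate_radius, threshold=math.inf):
    """Minimum total matching distance over gated aspect angle paths.

    dist_seq[k][s] : matching distance between the k-th observed signature
                     and the model signature at aspect angle state s
    states[s]      : (azimuth, elevation) of state s, in degrees
    gate_radius    : radius of the circular "association gate", in degrees
    threshold      : states whose distance is not below this value are
                     skipped for the second and later signatures
    Returns (total_distance, path); (inf, []) if no path survives.
    """
    n = len(states)
    inside = [[math.dist(states[a], states[b]) <= gate_radius
               for b in range(n)] for a in range(n)]
    total = list(dist_seq[0])
    back = []
    for k in range(1, len(dist_seq)):
        new_total, ptr = [], []
        for s in range(n):
            d = dist_seq[k][s]
            # Predecessors must lie inside the gate around state s.
            cands = [p for p in range(n)
                     if inside[p][s] and total[p] < math.inf]
            if d >= threshold or not cands:
                new_total.append(math.inf)
                ptr.append(-1)
                continue
            best = min(cands, key=lambda p: total[p])
            new_total.append(total[best] + d)
            ptr.append(best)
        total = new_total
        back.append(ptr)
    end = min(range(n), key=lambda s: total[s])
    if total[end] == math.inf:
        return math.inf, []
    path = [end]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return total[end], path[::-1]


def classify(dist_by_model, states, gate_radius):
    """Pick the model whose best gated path has minimum total distance."""
    results = {m: gated_dp_distance(d, states, gate_radius)
               for m, d in dist_by_model.items()}
    best = min(results, key=lambda m: results[m][0])
    return best, results
```

The gate is what separates this from an independent-look classifier: a model that matches each signature well only at widely separated aspect angles accumulates a large total distance, because no allowable path connects those matches.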
Significantly for our later discussion, Le Chevalier appears to forego explicitly the
opportunity  to  “propagate”  the  target  angular  state  according  to  observed  kinematics. 
He  maintains  that  restricting  allowable  transitions  to  known  “evolutionary  constraints” 
will  reveal  the  correct  target  class  and  aspect  angle  path  without  the  need  for  detailed 
knowledge  of  target  kinematics.  As  we  will  see  in  the  following  chapter,  Le  Chevalier’s 
claim  that  his  approach  works  “in  real  time”  for  aircraft  targets  is  further  indication  that 
he  eschews  the  propagation  of  angular  states  according  to  observed  kinematics. 

Subsequently,  the  author  conducted  detailed  research  to  determine  the  extent  to 
which  Le  Chevalier  et  al.  had  developed  the  concept  outlined  in  the  1978  article.  Further 
articles  by  Le  Chevalier  and  references  to  his  work  included  the  following  (no  further  works 
by  his  co-authors  were  found): 

(1) A classified article published in French as part of the 1980 AGARD Conference on “Image
and Sensor Data Processing for Target Acquisition and Recognition,” which resulted
in  AGARD  Conference  Proceedings  CP-290  of  the  same  title  [137]. 

(2) A French patent, number 2402971, awarded in April 1979 [135], the application (number
7727362) for which was referenced in the 1978 article. This patent covers the ideas
expressed  in  the  1978  article. 

(3)  A  reference  to  Le  Chevalier’s  1978  article  as  an  application  of  pattern  recognition 
for  target  identification,  without  further  comment,  in  a  1980  overview  on  (then)  recent 
advances  in  pattern  recognition  by  the  well-known  pattern  recognition  expert,  the  late  Dr. 
K.S.  Fu  [89]. 

(4)  A  paper  by  Le  Chevalier  delivered  at,  and  published  in  the  proceedings  of,  the  1984 
International  Conference  on  Radar  (not  IEEE-sponsored)  in  Paris  [138]. 

(5)  A  short  discussion  of  Le  Chevalier’s  1978  article  in  a  generally  excellent  encyclopedic 
Russian  work  [172]  covering  all  forms  of  radar  target  recognition  known  to  have  been 
published  in  the  West.  As  an  aside,  the  referenced  text  is  available  in  English  from  the 
Defense  Technical  Information  Center  (DTIC),  and  is  highly  recommended  for  any  target 
recognition  researcher. 
(6) An article by Le Chevalier published in the 1986 edition of La Recherche Aerospatiale
(the  organizational  journal  of  O.N.E.R.A.,  where  Le  Chevalier  was  employed)  [134]. 

(7)  A  reference  to  Le  Chevalier’s  1978  and  1984  works  in  a  chapter  on  target  recognition 
considerations in a 1987 book entitled Principles of Modern Radar [76]. The chapter was
written  by  Dr.  N.F.  Ezquerra,  now  (1991)  employed  by  Georgia  Tech  Research  Institute 
in  Atlanta,  GA. 

(8)  A  reference  to  Le  Chevalier’s  1986  work  in  a  1990  Recherche  Aerospatiale  article  by 
another  O.N.E.R.A.  employee  [25]. 

(9)  A  U.S.  patent,  number  4,735,379,  awarded  5  April  1988  [142],  which  refers  to  the  earlier 
French  patent  number  2402971. 

The  author  initially  obtained  the  works  listed  as  (3),  (4),  (6),  and  (7)  above,  and 
telephoned  Dr.  Ezquerra  (see  item  (7))  to  question  him  regarding  the  state  of  development 
of Le Chevalier’s concept. Neither item (4) nor item (6) discusses the subject of the 1978
article at all. Dr. Ezquerra stated [80] that he had co-chaired a panel with Le Chevalier
at the 1984 Conference, and that, so far as he (Ezquerra) knew, no further development
of Le Chevalier’s work had been made.

Ultimately,  the  author  obtained  the  other  items  listed.  Only  items  (1),  (2),  and  (9) 
mention  further  applications  of  the  1978  proposal.  Clearly,  the  patents,  or  items  (2)  and 
(9),  are  of  key  interest.  Item  (2)  does  not  amplify  the  1978  proposal  in  any  significant 
way  as  far  as  this  research  is  concerned.  In  an  extension  of  that  article,  however,  item  (2) 
expresses  the  intent  to  use  this  approach  in  an  application  similar  to  that  of  Kramer  and 
Demirbaş, as discussed in the preceding Sect. 2.5.6. Item (9) applies the 1978 concept to
searches  on  the  earth’s  surface,  rather  than  in  aspect  angle  on  a  target  signature  library. 
Again,  this  is  functionally  equivalent  to  the  Kramer  concept,  although  aspects  of  the 
Kramer  concept  appear  to  be  improvements  over  the  Le  Chevalier  approach. 

Item  (1)  refers  briefly  to  the  applicability  of  Le  Chevalier’s  concept  to  HRR  radar 
target  recognition,  but  does  not  develop  the  theory  of  the  concept  further.  The  fact 
that  this  source  is  printed  only  in  French  in  a  classified,  relatively  obscure  (at  least,  very 
unfortunately, for Americans) journal may explain the fact that the author has seen no
references to it anywhere. Item (5) essentially repeats the 1978 Le Chevalier article, but
misses  the  point  about  the  importance  of  dynamic  programming  to  the  method. 

Noting Le Chevalier’s claim in the 1978 article that his work was an original contribution
to the study of syntactic methods of pattern recognition as discussed by Fu in [88] and
other works, and Fu’s brief mention of the Le Chevalier article in Item (3) above [89], the
author reviewed subsequent books on syntactic/structural pattern recognition by Fu [90]
(pub. 1982) and Miclet [161] (pub. in French, 1984; in English, 1986). Neither of these
extensive works contains any mention whatsoever of Le Chevalier’s efforts. In addition, the
author  reviewed  the  references  for  all  books  and  articles  reviewed  previously  (including  all 
other  references  in  the  bibliography  of  this  work).  No  other  references  to  the  Le  Chevalier 
work  were  found. 

One other reference was found, however, that closely follows the Le Chevalier approach,
but derives from a different origin. That effort is the subject of the following
section. 

2.5.8 Recent Developments by H. Mieras et al. In reviewing the Proceedings of
the  1990  Combat  Identification  Systems  Conference  [231],  the  author  found  an  article  [164] 
by  H.  Mieras  et  al.  of  Raytheon  in  which  the  authors  claimed  to  have  made  use  of  dynamic 
programming  for  “integrating”  high  resolution  radar  range  sweeps  over  stable  or  changing 
target aspect angles. Mr. Mieras was contacted by telephone [162] and questioned about
his efforts. He stated that their approach (considered proprietary by Raytheon) was inspired
by the Kramer effort [129] discussed above, but working with aspect angle “bins”
(discretized values) rather than radar range/angle/doppler bins. Subsequently, a more
detailed report on the radar target recognition effort discussed in [164] was found to be
available in [165]. The Mieras effort is independent of, and evidently somewhat extends,
the  effort  of  Le  Chevalier. 

Like the Le Chevalier approach, the Mieras approach uses forward dynamic programming
to establish a minimum distance match between a set of signature observations and
library signatures stored as a function of aspect angle for any candidate target class. Due
to the Mahalanobis metric and the range alignment technique used by Mieras et al. (see
Sect. 2.2.3), this distance, for any given observation-to-model/aspect angle comparison,
is directly related to the maximum classical likelihood (probability density) $p(z_k^f \mid \omega_i, x_k)$
that observation $z_k^f$ was generated by model $\omega_i$ at discrete aspect angle state $x_k$.

In  addition  to  Kramer  [129],  Mieras  also  noted  as  references  the  articles  by  Barniv  [12, 
13]  noted  earlier,  and  another  by  Scharf  and  Elliot  [196]  (the  latter  is  a  relatively  broad 
1981  overview  on  dynamic  programming  in  signal  and  image  processing).  Mieras  was  not 
aware  of  the  work  by  Le  Chevalier  [136],  and  did  not  explicitly  mention  that  of  Larson  and 
Peschon  [133]. 

Of  particular  interest  to  the  author  was  the  way  in  which  the  Mieras  method  uses 
a priori aspect angle data, derived for example as we have discussed in Sect. 2.3.3.1 from
kinematics. Mieras noted that his aspect angle information was limited to “a rough estimate,
say within a 10 or 20 degree window at the time of the sweep” [162]. He also noted
that his algorithm did not establish an aspect angle path before the processing, but that
one  “comes  out  of  the  algorithm”  [162].  In  response  to  further  questions,  he  noted  that 
constraints  on  sweep  to  sweep  continuity  were  “set”  at  a  particular  angular  value.  More 
recent  communication  with  Mieras  [163]  appears  to  indicate  that  his  algorithm  biases  the 
angular  constraints,  or  a  circular  “association  gate”  like  that  discussed  above  for  the  Le 
Chevalier  approach,  in  the  direction  expected  according  to  observed  kinematics,  but  does 
not  use  formal  aspect  angle  transition  likelihoods,  which  we  will  see  are  available  from  the 
tracking  process. 

These  comments  by  Mieras  and  the  referenced  articles  indicated  to  the  author  that 
the Mieras method does not fully exploit or “fuse” information available from kinematics,
which is the fundamental intent of this research. To the level of description available,
Mieras’ technique apparently corresponds closely to procedures described for the Le Chevalier
method - matching n observations to n locations in an aspect angle state space, subject
to kinematic transition limitations. The steps taken by Mieras to bias angular constraints
in the direction expected according to observed kinematics and other techniques [162] appear
to make Mieras’ method an improvement over that of Le Chevalier. The differences
between these methods and the accomplished research of the author will be highlighted in
the  following  chapter. 
2.5.9 Significance of the Le Chevalier and Mieras Efforts. The core inspiration
behind the research described in this dissertation was the observation that dynamic programming
sequence comparison could be a key tool for multisensor fusion and dynamic
object recognition. Clearly, Le Chevalier and Mieras have independently preceded the
author in the kernel of that observation. The author’s real contribution, then, is to develop
further the theory and methods for use of dynamic programming and other sequence
comparison methods for these purposes. The techniques of Le Chevalier and Mieras will
be shown to be elements in a family of more-or-less capable algorithms for achieving a
particular purpose in multisensor fusion - exploiting the joint likelihood of all observable
events. The author strongly believes that the class of approaches conceived by Le Chevalier,
Mieras, and the author has not heretofore received the development or application
that  it  deserves. 




2.6  Multisensor  Fusion 


2.6.1 General Overview. Multisensor fusion is an extremely broad concept which
encompasses  all  techniques  for  melding  information  from  one  or  more  sensors  to  make 
better  (i.e.,  more  probably  successful)  decisions  than  could  be  made  using  the  output  of 
individual  sensors  separately.  Taxonomies  for  multisensor  fusion  are  more  diverse  than 
those  for  pattern  recognition,  and  this  work  will  not  delineate  any  -  the  reader  is  referred 
to any of [42, 43, 44, 106, 107, 126, 145, 146, 189, 218, 224]. From [218:1], we have the
following  definition  for  data  fusion  (equivalent  to  multisensor  fusion  from  our  perspective), 
in  a  military  context: 

(multisensor  or  data  fusion  is)  a  multilevel,  multifaceted  process  dealing  with 
the  detection,  association,  correlation,  estimation,  and  combination  of  data 
and  information  from  multiple  sources  to  achieve  refined  state  and  identity 
estimation,  and  complete  and  timely  assessments  of  situation  and  threat. 

It  should  be  recognized  by  the  reader  that  all  of  the  estimation  and  recognition 
approaches  discussed  thus  far  in  this  chapter  are  sensor  fusion  techniques  -  even  the  lowly 
α-β tracker, fusing information from one sensor over time, can be considered a sensor
fusion  device.  In  Sect.  2.2,  we  discussed  fusion  of  feature  observable  information  to  define 
the  pattern  class  and  orientation  for  an  unknown  object.  In  Sect.  2.3.2,  we  discussed  fusion 
of  kinematic  information  to  determine  object  track.  Finally,  in  Sect.  2.3.3,  we  discussed 
fusion  of  kinematic  and  feature  observable  information  to  determine  object  tracks. 

Perhaps  the  one  fundamental  principle  behind  more  forms  of  multisensor  fusion  than 
any  other  is  Bayes’  Rule  [197,  177],  underlying  as  noted  above  a  major  branch  of  pattern 
recognition  as  well  as  neural  net  theory  [190]  and  the  Kalman  filter  [153:209-217].  As  we 
saw  in  Sect.  2.2.1,  however,  the  full  potential  of  Bayes’  rule  can  only  be  reached  where  we 
know  all  possible  classes  of  events  (in  our  case,  objects  and  their  orientations),  the  a  priori 
probability  of  the  occurrence  of  those  events,  and  the  probability  (likelihood)  that  a  given 
actual  event  generates  a  given  observation. 

Lacking  this  a  priori  information,  the  principal  alternative  is  maximum  likelihood 
estimation or classification (e.g., Eqn. (2.2)). For example, if we have no a priori information
on object or orientation probability, but know all possible classes of objects
and have the classical likelihood function $p(z \mid \omega_i)$ that each candidate object $\omega_i$ generates
any feasible observation z, we may assign the unclassified object to the class which
maximizes that classical likelihood function for the actual observed z. Maximum likelihood
classification can be used suboptimally even where we do not know all possible
classes of objects, by assigning an unclassified object to the known class which maximizes
the classical likelihood function. Maximum likelihood estimation is identified by
some [10, 33, 36, 82, 106] [224:6] [218:215] to be a distinctively important technique for
multisensor fusion, although other authors [145, 146] do not explicitly discuss it, presumably
lumping it together with other probabilistic techniques. It is important to understand,
however,  that  the  output  of  a  maximum  likelihood  estimator  need  not  be  a  probability 
measure  per  se,  as  we  will  discuss  in  Sect.  2.7. 
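
As a minimal illustration of maximum likelihood classification, the sketch below assigns a scalar observation to whichever known class gives it the highest likelihood. The one-dimensional Gaussian class models and all names are assumptions chosen for brevity; they are not the likelihood functions used elsewhere in this work.

```python
import math


def gaussian_pdf(z, mean, sigma):
    """Value of a 1-D Gaussian probability density at z."""
    return (math.exp(-0.5 * ((z - mean) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi)))


def ml_classify(z, class_models):
    """Assign z to the known class with the largest likelihood p(z | class).

    class_models maps a class name to (mean, sigma) of an assumed Gaussian
    class-conditional density. Returns (best_class, likelihoods).
    """
    likes = {name: gaussian_pdf(z, mean, sigma)
             for name, (mean, sigma) in class_models.items()}
    best = max(likes, key=likes.get)
    return best, likes
```

The returned likelihoods are density values and do not sum to one over the classes, which is one way to see that the output of a maximum likelihood classifier is not a probability measure per se.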

Considering  the  use  of  maximum  likelihood  classification  when  all  classes  are  not 
known,  the  reader  may  observe  that  there  is  an  unquantified  probability  that  the  observed 
measurements  do  not  correspond  to  any  of  the  known  objects.  For  cases  like  this,  where 
some  lack  of  information  introduces  the  need  to  quantify  uncertainty,  the  Dempster-Shafer 
technique  is  available  [8,  33,  218].  The  Dempster-Shafer  methodology  has  been  called  a 
generalization  of  Bayes’  Rule  to  allow  for  uncertainty  [33:381-386].  The  Dempster-Shafer 
analog  to  probability  is  probability  mass,  representing  knowledge.  This  mass  (totalling  to 
a  value  of  one)  may  be  allocated  in  combinations  of  four  different  ways:  (1)  to  support 
(confirmation) of any one particular hypothesis (say $H_1$); (2) to uncertainty (say U), confirming
no hypothesis; (3) to a disjunction of hypotheses - say $H_1 \lor H_2 \lor \cdots \lor H_n$, where
$\lor$ (or) indicates logical disjunction (i.e., information that indicates that $H_1$, or some other
hypothesis, is true); or finally (4) to negation of any particular hypothesis $H_1$, confirming
for example that $H_1$ is not true. All mass not allocated to the negation of a particular
hypothesis  is  taken  to  indicate  the  plausibility  of  that  hypothesis. 

Just  as  with  a  Bayesian  classifier,  where  the  addition  of  measured  information  ideally 
causes  the  a  posteriori  probability  of  one  hypothesis  to  rise  above  the  others,  the  addition 
of  information  to  a  Dempster-Shafer  classifier  causes  probability  mass  to  be  reallocated, 
ideally  causing  a  preponderance  of  mass  to  be  assigned  to  the  support  category  for  one 
of the known hypotheses. When hypotheses do not overlap, and there is no uncertainty,
Bayes’  Rule  and  Dempster-Shafer  logic  produce  equivalent  results. 
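
The mass allocation described above can be sketched with the standard Dempster rule of combination. The representation is an illustrative assumption: each mass function is a dictionary from a frozenset of hypothesis labels to mass, with mass on the whole frame playing the role of uncertainty U and mass on a complement set playing the role of negation.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass, with the
    masses summing to one. Mass falling on empty intersections (conflict)
    is discarded and the remainder renormalized.
    """
    raw = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                raw[inter] = raw.get(inter, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {a: v / (1.0 - conflict) for a, v in raw.items()}


def belief(m, hyp):
    """Support: total mass committed to subsets of hyp."""
    return sum(v for a, v in m.items() if a <= hyp)


def plausibility(m, hyp):
    """Total mass not committed to the negation of hyp."""
    return sum(v for a, v in m.items() if a & hyp)
```

When all mass sits on single, non-overlapping hypotheses, `combine` reduces to a normalized product of the two mass functions, which is the same arithmetic as a Bayes update with one function as the prior and the other as the likelihood.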

2.6.2  Multisensor  Fusion  for  Dynamic  Objects.  In  this  section,  we  will  address 
the  question  of  why  a  multisensor  fusion  algorithm  which  fuses  feature  observable  and 
kinematic  information  should  be  able  to  recognize  or  discriminate  objects  better  than  an 
algorithm  using  either  type  of  information  alone.  The  fundamental  point  to  be  made  here 
is that by increasing the number of test conditions that an unknown object class must
“pass”  in  order  to  be  identified  as  a  member  of  a  known  class,  we  reduce  the  probability 
that  an  unclassified  object  could  pass  the  tests  for  an  incorrect  class.  Said  another  way, 
adding  kinematic  or  other  requirements  reduces  the  joint  likelihood  that  the  unclassified 
object  could  exhibit  the  particular  combination  of  behavior  corresponding  to  an  incorrect 
class. 

These observations regarding the significance of joint likelihood for improved multisensor
fusion and object recognition are not original to this effort. The unique aspects
of  this  research  are  that  we  provide  (1)  new  understanding  as  to  why  it  is  important  to 
consider  the  joint  likelihood  of  kinematic  and  “nonkinematic”  or  feature  observable  events, 
and  (2)  a  class  of  tools  -  sequence  comparison  algorithms  and  multiple  model  estimators 
(e.g.,  composed  of  Kalman  filters)  -  for  exploiting  the  joint  likelihood  of  observed  events 
over  time  in  dynamic  object  or  target  recognition. 

Most  automatic  object  recognition  algorithms  use  feature  observable  information 
only  -  those  that  are  Bayesian  in  nature,  providing  an  a  posteriori  probability  of  class 
membership as in Eqn. (2.1), can be characterized as providing the a posteriori probability
$p(\omega_i \mid Z_k^f)$ that we are actually observing an object of class $\omega_i$, given a set of k
feature or signature measurements $Z_k^f = \{z_1^f, z_2^f, \ldots, z_k^f\}$ (where the subscripts imply
measurement respectively at discrete times $t_1$ through $t_k$, and the superscript stands for
feature), and a priori object class probabilities $p(\omega_i)$ for each of J known object classes.
Classically,  the  measurements  are  treated  as  independent  in  time,  or  from  measurement  to 
measurement. 
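
Under the classical independence assumption, the a posteriori class probabilities follow from Bayes' rule as the normalized product of the prior and the per-measurement likelihoods. The sketch below shows only that arithmetic; the function name and the dictionary layout are illustrative assumptions.

```python
def posterior(priors, likelihoods_per_look):
    """A posteriori class probabilities from independent looks.

    priors               : dict mapping class name to a priori probability
    likelihoods_per_look : list of dicts, one per measurement, mapping
                           class name to the likelihood of that measurement
                           given the class
    Measurements are treated as independent from look to look, so the
    per-look likelihoods simply multiply.
    """
    unnorm = dict(priors)
    for look in likelihoods_per_look:
        for name in unnorm:
            unnorm[name] *= look[name]
    total = sum(unnorm.values())
    return {name: value / total for name, value in unnorm.items()}
```

Because each look enters only through its own likelihood, nothing in this computation prevents the implied aspect angle from jumping arbitrarily between looks; that is the gap the fusion of kinematic information is meant to close.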
Using a Bayesian approach and fusing kinematic information, however, we would really
desire to produce a pattern recognition system that estimates the a posteriori probability
$p(\omega_i \mid Z_k^f, Z_m^d)$ that we are actually observing an object of class $\omega_i$, given a set of k feature
or signature measurements as above, m kinematic measurements $Z_m^d = \{z_1^d, z_2^d, z_3^d, \ldots, z_m^d\}$
(note use of superscript d, as in “dynamic”, rather than k as in “kinematic”, since k is
a counting index in the Larson and Peschon form), and a priori object class probabilities
$p(\omega_i)$. The “kinematic” measurements correspond classically to positions and velocities at
discrete times $t_1$ through $t_m$ - the likely relationship between $t_1$ through $t_k$ and $t_1$ through
$t_m$ is discussed in Sect. 3.6.5.

This ideal, but in practice generally unobtainable, system would consist of J functions,
one for each object class, having a domain of the space of all measurements over
time and a range on the interval of the real line from zero to one, with the sum of the J
function values equal to one (or less than one, if we wish to allow for unknown classes).
Following Rao [184:353], however, in the absence of $p(\omega_i \mid Z_k^f, Z_m^d)$ (or equivalently, the
joint probability density $p(\omega_i, Z_k^f, Z_m^d)$ of object class and measurements), we must be content
to find a set of “generalized” likelihood functions such that the maximum value for
each  function  is  attained  for  the  correct  combination  of  object  class  and  measurements. 
This  likelihood  function  may  not  provide  a  probability  measure  per  se. 

More specifically, a generalized likelihood function $L[x, y, Z_k]$ [154:75] is simply a real-valued
function of states x, parameters y, and measurement history $Z_k$ (i.e., measurements
$z_1, z_2, z_3, \ldots$ up through and including some latest measurement $z_k$ at time $t_k$), which is
defined a priori for a given set of states x (a maneuver) and parameters y (an object). The
distinction between states and parameters made here is again that of Maybeck [154:69], as
in Sect. 2.3.1.3. We require only that this function consistently achieve a maximum value
for measurements $Z_k$ taken from a truth model or actual system having the same values
of  (object)  parameters  and  states  -  in  other  words,  the  likelihood  function  is  a  “matched 
filter”  which  should  exhibit  high  “gain”  only  for  signals  with  its  design  specifications. 
Clearly,  the  object  signature  libraries,  aspect  angle  “windows”,  and  associated  metrics 
used  in  classical  automatic  object  recognition  (e.g.,  [20])  are  likelihood  functions. 
What can we gain by considering the kinematics of the unknown object? Consider
an abstract space $\Theta$ of all possible object models and aspect angles over time as part of
the domain of a matching function, $\mathcal{L}$. The remainder of the matching function domain
lies in another abstract space $Z^f$ of noise-corrupted feature observations. The matching
function operates on an element from $\Theta$ and an element from $Z^f$ to produce a scalar value,
the magnitude of which is some measure of the likelihood that the chosen element in $\Theta$
could produce the observed element in $Z^f$.

A classical generalized likelihood function for recognition of a particular object class
$\omega_i$ is defined by restricting the domain of $\mathcal{L}$ to produce an $\mathcal{L}_i$ with domain $\Theta_i \subset \Theta$ corresponding
to $\omega_i$. Typically, we match sets of noise-corrupted feature observations from
abstract space $Z^f$ over time to elements in the first space. Unfortunately, these classical
decision  theoretic  functions  may  give  higher  likelihoods  than  ideal  for  the  wrong  object 
class,  in  part  because  kinematically  unlikely  aspect  angles  (pose  estimates)  and  aspect 
angle  transitions  over  time  are  allowed.  Conventional  algorithms  that  limit  searches  in 
aspect  angle  to  “windows”  defined  by  kinematics  are  a  step  in  the  right  direction,  but 
these  algorithms  are  still  “independent  look”  algorithms  as  defined  in  Sect.  2.2.1,  and  wild 
pose  estimate  transitions  may  occur  from  measurement  to  measurement. 

The key to the author’s approach is to restrict the domain of each matching function
$\mathcal{L}_i$ further, requiring the object aspect angle over time to be consistent with the observed
kinematics  and  vice  versa,  since  this  restriction  (correctly  executed)  should  not  adversely 
affect  matching  or  likelihood  function  values  for  measurements  from  the  correct  object  class, 
but  may  lower  the  values  given  by  matching  functions  corresponding  to  incorrect  object 
classes.  This  approach  for  understanding  the  need  to  fuse  “kinematic”  and  “nonkinematic” 
or  feature  observable  information  is  an  original  contribution  of  this  effort.  Despite  the 
general lack of a full expression for the joint classical likelihood $p(\omega_i, Z^f, Z^d)$ of object
class  and  measurements,  we  still  seek  to  exploit  the  joint  nature  of  observable  events  by 
finding  the  object  class  most  likely  to  have  exhibited  simultaneously  the  behavior  observed 
over  time  in  several  domains  (kinematic,  feature  observable,  etc.). 

As we will show, prior efforts have moved in this direction by restricting the matching
or likelihood function domain to be consistent with feasible kinematics [136], and, in a
suboptimal fashion, consistent with observed kinematics [164, 165, 163]. By further, optimal
restriction using observed kinematics, we will achieve a more highly “tuned” likelihood
function (again using the matched filter analogy). Simply, restricting the matching domain
of a likelihood function according to kinematics is the analog of conditioning $p(\omega_i \mid Z^f)$,
were it known, on the added information given by kinematic measurements $Z^d_m$.

The  preceding  discussion  correctly  implies  that  we  will  always  have  equal  or  better 
recognition  performance  in  identifying  a  particular  object  in  a  particular  maneuver  if  we 
progressively  restrict  the  domain  (in  time,  space,  frequency,  or  any  other  dimension)  of 
each  candidate  matching  or  likelihood  function  to  subsets  that  exhibit  only  the  correct 
combination of behavior for the corresponding object - and no more. The correct
object-to-likelihood function match will still indicate high likelihoods - likelihoods for incorrect
matches are more likely to fall.

Mathematically, the concept of restricting the matching domain for equal or better
performance can be expressed as shown in Eqn. (2.36) below - for any likelihood function
$\mathcal{L}_i$ corresponding to object class $\omega_i$, with the kinematically unrestricted and restricted
matching domains denoted respectively by $S_i$ and $T_i$, we can show trivially by contradiction
that, for $T_i \subset S_i$:

$$\sup \mathcal{L}_i(T_i, Z^f) \le \sup \mathcal{L}_i(S_i, Z^f) \qquad (2.36)$$

The  preceding  equation  does  not  imply  that  any  matching  domain  restriction  will 
improve  recognizer  performance.  Domain  restriction  must  be  done  correctly  -  no  likelihood 
function  for  an  object  class  should  fail  to  include  areas  in  the  domain  that  may  contain 
feasible  sets  of  behavior  for  that  class.  Since  theoretically  optimal  domain  restrictions  may 
never  be  known,  we  expect  that  some  tuning  process  may  be  required  to  define  empirically 
optimal  restrictions. 
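
The inequality in Eqn. (2.36), and the caution that follows it, can be illustrated with a small numerical sketch. Everything in it is hypothetical: the aspect angle grid, the two toy signature models, the noise level, and the function names are assumptions introduced only for illustration, and the sketch is not the recognition algorithm developed in this research.

```python
import math

# Hypothetical toy models: each object class maps a discretized aspect
# angle (degrees) to an expected scalar feature value.
ANGLES = range(0, 360, 10)
MODELS = {
    "class_a": {a: math.sin(math.radians(a)) for a in ANGLES},
    "class_b": {a: math.cos(math.radians(a)) for a in ANGLES},
}

def log_likelihood(model, angle, z, sigma=0.1):
    """Gaussian log-likelihood (up to a constant) of feature z at one aspect angle."""
    return -0.5 * ((z - model[angle]) / sigma) ** 2

def sup_likelihood(model, domain, z):
    """Largest matching-function value over a set of allowed aspect angles."""
    return max(log_likelihood(model, a, z) for a in domain)

# Truth: a class_a object seen at 30 degrees (noise-free for clarity).
z_obs = MODELS["class_a"][30]

unrestricted = list(ANGLES)      # S_i: every aspect angle is allowed
window = [20, 30, 40]            # T_i: angles consistent with observed kinematics
bad_window = [200, 210, 220]     # an incorrect restriction that excludes the truth

sup_a_S = sup_likelihood(MODELS["class_a"], unrestricted, z_obs)
sup_a_T = sup_likelihood(MODELS["class_a"], window, z_obs)
sup_b_S = sup_likelihood(MODELS["class_b"], unrestricted, z_obs)
sup_b_T = sup_likelihood(MODELS["class_b"], window, z_obs)
sup_a_bad = sup_likelihood(MODELS["class_a"], bad_window, z_obs)
```

In this toy case the correct restriction leaves the value for the true class unchanged and lowers the value for the wrong class, while the incorrect restriction lowers the value for the true class, which is the failure the preceding paragraph warns against.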

Using  the  same  approach  from  another  perspective,  we  may  force  kinematics  to  be 
consistent with aspect angle estimated from feature observations. Where an incorrect
unknown object-to-library match forces a tracker/filter dynamic model to deviate strongly
from the true dynamics, this mismatch will be immediately evident through poor tracking
performance, large filter residuals, and other factors. Note that, in effect, we have extended
the abstract space $\Theta$ to include not only object models and aspect angles, but kinematic
(translational  states  and  derivatives,  and  derivatives  of  the  fundamental  rotation  states 
associated  with  aspect  angles)  and  time  dimensions  as  well  -  and  our  matching  process 
now  requires  consistency  across  all  dimensions,  i.e.,  high  joint  likelihood. 
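
The residual argument above can be sketched with a scalar Kalman filter. This is a minimal illustration under stated assumptions, not the kinematic/aspect tracker of this research: it assumes each candidate object-to-library match implies a velocity `v_hyp` for the filter dynamic model, and all names and numerical values are hypothetical.

```python
import random

def mean_normalized_residual(measurements, v_hyp, dt=1.0, q=0.01, r=1.0):
    """Run a scalar Kalman filter whose dynamic model assumes velocity v_hyp.

    Returns the mean normalized squared residual; values near 1 indicate
    residuals consistent with the filter's own predicted covariance.
    """
    x, p = measurements[0], r            # initialize from the first measurement
    total = 0.0
    for z in measurements[1:]:
        x_pred = x + v_hyp * dt          # propagate with the hypothesized dynamics
        p_pred = p + q
        s = p_pred + r                   # predicted residual variance
        resid = z - x_pred
        total += resid * resid / s
        gain = p_pred / s
        x = x_pred + gain * resid        # measurement update
        p = (1.0 - gain) * p_pred
    return total / (len(measurements) - 1)

rng = random.Random(0)
v_true, dt, r = 10.0, 1.0, 1.0
truth = [v_true * dt * k for k in range(50)]             # true positions
zs = [x + rng.gauss(0.0, r ** 0.5) for x in truth]       # noisy position measurements

nis_correct = mean_normalized_residual(zs, v_hyp=10.0)   # matched hypothesis
nis_wrong = mean_normalized_residual(zs, v_hyp=4.0)      # mismatched hypothesis
```

The mismatched hypothesis produces a much larger normalized residual than the matched one, which is the signal a recognizer could use to reject that match.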

This  joint  likelihood  of  kinematics  and  feature  observables  can  be  exploited  in  a 
Bayesian  structure,  naturally  expressed  in  practice  as  a  generalized  likelihood  function. 
This  is  the  subject  of  the  next  chapter. 

Finally,  recalling  the  discussion  of  syntactic  pattern  recognition  in  Sect.  2.2.2,  note 
that  the  preceding  discussion  is  implicitly  an  argument  for  the  application  of  syntactic 
methods  in  object  recognition.  In  effect,  we  have  said  not  only  that  particular  pattern 
primitives  are  characteristic  of  particular  objects,  but  their  order  of  presentation,  and  the 
productions  which  govern  transitions,  are  as  well.  The  extremely  powerful  observation 
that  syntactic  techniques  apply  in  target  recognition  was  first  made  by  Le  Chevalier  [136]. 
Another observation by Le Chevalier in the same source was that syntactic dynamic programming
methods would reduce ambiguity in target recognition.
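
To illustrate why order of presentation carries information, the following sketch uses a generic dynamic time warping recursion as a stand-in for dynamic programming sequence comparison. It is an assumption made for illustration only: the cost function, the allowed steps, and the toy sequences are hypothetical, and this is neither Le Chevalier's method nor the algorithm developed in later chapters.

```python
def dtw_distance(observed, model):
    """Dynamic-programming sequence comparison between two scalar sequences."""
    n, m = len(observed), len(model)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning observed[:i] with model[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(observed[i - 1] - model[j - 1])
            # Allowed transitions: repeat a model element, skip ahead, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

model_sequence = [1.0, 2.0, 3.0, 2.0]
same_order = [1.0, 1.0, 2.0, 3.0, 3.0, 2.0]   # same primitives, same order, stretched in time
shuffled = [2.0, 1.0, 3.0, 3.0, 1.0, 2.0]     # same primitives, different order

d_same = dtw_distance(same_order, model_sequence)
d_shuffled = dtw_distance(shuffled, model_sequence)
```

The time-stretched sequence matches the model at zero cost, while the reordered sequence does not, even though both contain exactly the same primitives.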

This  last  observation,  independently  made  by  the  author  and  focused  further  in  this 
research,  led  to  the  choice  of  generalized  ambiguity  functions  for  evaluating  the  performance 
of the proposed “generalized” likelihood function algorithms. The ease of making an analytical
prediction of performance for a generalized likelihood function depends on the nature
of  the  function.  Any  such  function,  however,  can  be  evaluated  in  experiment  or  simulation 
with  the  use  of  a  generalized  ambiguity  function  (as  proposed  by  Schweppe  [198,  154]), 
which  is  the  subject  of  Sect.  2.7  below. 

2.7  Generalized  Ambiguity  Functions 

This  section  follows  the  development  given  by  Maybeck  [154:96-101]  to  motivate  the 
use of generalized ambiguity functions (GAF) for assessing maximum likelihood estimator
performance, as developed by Schweppe [198:376-381]. The generalized ambiguity function
is  an  extension  of  the  classical  radar  ambiguity  function  [57,  16]  used  to  define  radar  pulse 
parameters  for  optimum  information  quality  (balancing  range  resolution  vs.  doppler  error, 
for  example)  according  to  system  requirements. 

The  result  of  the  research  described  in  this  dissertation  is  an  object  recognition 
approach  which  seeks  to  reduce  ambiguity  in  recognition  decisions  by  fusing  all  available 
information  over  short  periods  of  time.  The  generalized  ambiguity  function  discussed  in 
this  section  will  be  shown  to  be  a  natural  tool  for  evaluating  the  performance  of  this 
research  product  and,  in  general,  any  object  recognition  schemes  that  can  be  viewed  as 
maximum  likelihood  estimators. 

The generalized ambiguity function is defined by the following equation:

$$\mathcal{A}_k(\Omega, \Omega_t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} L[\Omega, \mathcal{Z}_k] \, f_{Z(t_k) \mid \Omega(t_k)}(\mathcal{Z}_k \mid \Omega_t) \, d\mathcal{Z}_k \qquad (2.37)$$

where:

$\Omega_t$ = the particular combination of states x(t) and parameters y (generally constant
over the time interval of interest) for the truth system which generates the set of all possible
measurement histories $Z_k$ over which the integral is taken - a likelihood function L defined
for $\Omega_t$, operating on an element of this measurement history set, will ideally generate a
higher value than will any L defined for some other value of $\Omega$, operating on an element of
this measurement history set (the ambiguity function evaluates the extent to which this is
true in the mean)

$\Omega$ = state/parameter values for which the generalized likelihood function is defined,
for evaluation against measurements generated by a truth model with state and parameter
values $\Omega_t$

$\mathcal{A}_k(\Omega, \Omega_t)$ = the generalized ambiguity function, a function of $\Omega$ for a given $\Omega_t$ and
likelihood function L

$L[\Omega, \mathcal{Z}_k]$ = the generalized likelihood function, a function of $\mathcal{Z}_k$ when defined for a
given $\Omega$ (note that the calligraphic letter “Z,” or $\mathcal{Z}$, is used as the “dummy” form of Z,
appropriate for showing functional relationships in an integrand)

$f_{Z(t_k) \mid \Omega(t_k)}(\mathcal{Z}_k \mid \Omega_t)$ = the probability density function of the measurements, given
that the true states and parameters have the value $\Omega_t$

$Z_k$ = the measurement history vector as of time $t_k$, as defined in Sect. 2.6

The  reader  must  not  allow  the  complexity  of  Eqn.  (2.37)  to  obscure  the  essential 
simplicity  of  the  concept.  The  generalized  ambiguity  function  is  the  expected  (or  mean) 
value  of  a  generalized  likelihood  function  defined  for  varied  combinations  of  states  and 
parameters,  conditioned  on  the  true  states  and  parameters  having  particular  values.  For 
any particular value of $\Omega$ defining the likelihood function, there is in fact a distribution
of likelihood function values produced, due to the different realizations of measurements
produced by a system with true states and parameters $\Omega_t$.
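
As a sketch of this "expected value" reading, the ambiguity function can be estimated by averaging a log-likelihood over simulated measurement histories. The scalar Gaussian measurement model, the parameter grid, and all function names below are assumptions chosen for illustration; they are not the likelihood functions used in this research.

```python
import math
import random

def log_likelihood(omega, history, sigma=1.0):
    """Gaussian log-likelihood of a measurement history for a scalar parameter omega."""
    n = len(history)
    sq = sum((z - omega) ** 2 for z in history)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * sq / sigma ** 2

def ambiguity_mc(omega_grid, omega_true, n_meas=5, n_runs=2000, sigma=1.0, seed=0):
    """Monte Carlo estimate of the ambiguity function over omega_grid.

    Histories are generated once from the truth model (omega_true) and reused
    for every candidate omega, so each grid value is a sample mean of the
    likelihood function over the same set of measurement realizations.
    """
    rng = random.Random(seed)
    histories = [[rng.gauss(omega_true, sigma) for _ in range(n_meas)]
                 for _ in range(n_runs)]
    return {omega: sum(log_likelihood(omega, h, sigma) for h in histories) / n_runs
            for omega in omega_grid}

grid = [0.5 * i for i in range(9)]          # candidate values 0.0, 0.5, ..., 4.0
amb = ambiguity_mc(grid, omega_true=2.0)
peak = max(amb, key=amb.get)                # grid value with the largest mean likelihood
```

For this well-behaved toy likelihood the estimated ambiguity function peaks at the true parameter value.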

Examining the behavior of the ambiguity function for each realizable value of $\Omega_t$,
and, for each such value of $\Omega_t$, a range of $\Omega$ encompassing reasonable values of states and
parameters expected other than at $\Omega_t$, we desire that the ambiguity function have an easily
discernible global maximum at $\Omega_t$ - i.e., that local maxima, if present, are “widely” separated
from the global maximum at $\Omega_t$. Fig. 2.10 shows examples of “better” (than good),
“good,”  “mediocre,”  and  “poor”  ambiguity  functions,  corresponding  to  different  likelihood 
functions  of  respective  quality  defined  over  the  same  domain  of  state  and  parameter  values. 

It  is  important  to  understand  the  intent  behind  the  grading  terms  applied  in  Fig.  2.10. 
The “good” ambiguity function is so rated because it has a single peak value, peaking at
the value of $\Omega = \Omega_t$ as desired, with the peak “rolling off” reasonably quickly as $\Omega$
moves away from $\Omega_t$. The “better” ambiguity function is so rated because it has all of
the qualities of the “good” function, but “rolls off” even more quickly (more precisely, the
second derivative or curvature of the “better” function at the peak value is more negative).
This means that an identification of true state and parameter values at $\Omega_t$ can be made
with less ambiguity than with the “good” function.

The “mediocre” ambiguity function may be so rated because of any of several non-ideal
factors. First, it has peaks other than the one at $\Omega_t$, so that a likelihood function
parameterized for the wrong values may give a response close to the likelihood function
parameterized for the correct value. This may lead to difficulty in making a correct decision
as to which true but unknown state/parameter set is present. Second, its main peak does
not have a maximum value exactly at $\Omega_t$. Third, this ambiguity function has slower rolloff
than  the  “good”  function.  Finally,  the  “poor”  function  reflects  the  undesirable  qualities  of 
the  mediocre  function  to  a  particularly  exaggerated  degree. 

Furthermore, particularly for applications where we will be able to extract only a
limited number of likelihood function values prior to making a state/parameter identification,
we desire that the likelihood function values for any combination of $\Omega$ and $\Omega_t$ be
closely distributed about the mean or ambiguity function value. This need drives us to
consider determining not only the ambiguity function, but also higher level moments of
the likelihood function probability density function for likely combinations of $\Omega$ and $\Omega_t$.
Ambiguity  functions  can  be  developed  analytically  for  some  likelihood  functions  [152, 154], 
or  in  any  case  empirically  by  Monte  Carlo  simulation  runs  or  extended  experimentation. 
Likewise,  likelihood  function  probability  densities  might  presumably  be  found  analytically 
in  some  cases  (as  an  extension  to  [152],  for  example),  but  could  also  be  estimated  by  Monte 
Carlo  or  experimental  research. 

To relate these curves to the classical “probabilities of correct/incorrect recognition”,
etc., generally used in ATR, note that a horizontal “threshold” line could be drawn through
these curves. The likelihood function probability density functions (recall that the generalized
ambiguity function is the mean) for the true parameter point (object) and some other
parameter point (object) then allow one to establish the relative probability of likelihood
function values above or below the threshold at either object. This allows an immediate
estimate of probability of correct/incorrect recognition, etc., for that threshold. This
can only be an estimate of such probabilities, however, because the generalized ambiguity
function is a mean value over many (ideally, all possible) measurements, and as such, it
obscures correlations between responses for individual measurements that could tend to
cause likelihood functions for both of two different objects to fall above or below some
threshold.
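
The threshold argument can be sketched by Monte Carlo simulation. The example assumes a scalar Gaussian measurement model and an arbitrary threshold; the parameter values, the threshold, and the function names are hypothetical. It estimates how often the likelihood function exceeds the threshold at the true parameter point and at one other point, and also how often both exceed it for the same measurement history, which is the correlation that the mean alone hides.

```python
import math
import random

def log_likelihood(omega, history, sigma=1.0):
    """Gaussian log-likelihood of a measurement history for a scalar parameter omega."""
    n = len(history)
    sq = sum((z - omega) ** 2 for z in history)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * sq / sigma ** 2

rng = random.Random(1)
omega_true, omega_other = 0.0, 1.5   # true object and one competing object
threshold = -9.0                     # hypothetical decision threshold
n_runs, n_meas = 2000, 5

above_true = above_other = above_both = 0
for _ in range(n_runs):
    # Each history is generated by the truth model only.
    history = [rng.gauss(omega_true, 1.0) for _ in range(n_meas)]
    hit_true = log_likelihood(omega_true, history) > threshold
    hit_other = log_likelihood(omega_other, history) > threshold
    above_true += hit_true
    above_other += hit_other
    above_both += hit_true and hit_other

p_true = above_true / n_runs     # fraction of runs accepting the true object
p_other = above_other / n_runs   # fraction of runs accepting the wrong object
p_both = above_both / n_runs     # fraction of runs accepting both at once
```

Comparing `p_both` with the product `p_true * p_other` gives a rough indication of how strongly the two responses are correlated across individual measurement histories.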

If  a  likelihood  function  is  expressed  as  a  natural  logarithm  (so  that  the  first  and  second 
derivatives  give  a  “non-dimensional”  or  scaled  slope  and  “curvature”  of  the  likelihood 
function), the curvature of the corresponding ambiguity function near the value of $\Omega_t$
can be related to the Cramer-Rao lower bound (CRLB) [184] of the covariance for a
state/parameter  estimate  obtained  by  the  use  of  the  underlying  likelihood  function  [154, 
198].  For  Gaussian  likelihood  functions,  which  are  naturally  expressed  in  log  form,  this 
relationship  is  particularly  powerful. 

Cramer-Rao lower bounds and the class of analogous bounds called Cramer-Rao-like
lower bounds are of much interest in estimation theory, since they define a theoretical bound
on the quality of information that can be inferred from an estimator [125, 34, 210]. For
a vector parameter $\Omega$ estimated by some process yielding an unbiased estimate $\hat{\Omega}$, where
the true parameter value set is given by $\Omega_t$, the Cramer-Rao lower bound on maximum
likelihood estimate error variance is given by:

$$E\{[\hat{\Omega} - \Omega_t][\hat{\Omega} - \Omega_t]^T\} \ge \left[ -\frac{\partial^2}{\partial \Omega^2} \mathcal{A}_k(\Omega, \Omega_t) \Big|_{\Omega = \Omega_t} \right]^{-1} \qquad (2.38)$$

where $\mathcal{A}_k(\Omega, \Omega_t)$ is the ambiguity function corresponding to the log likelihood function,
and all terms have been defined previously. This expression provides a lower bound as
well for biased estimates, for which a “tighter” lower bound can be defined if the bias is
well-characterized [154:97].
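
The link between ambiguity function curvature and the Cramer-Rao lower bound can be checked numerically for a simple case. The sketch below assumes n independent scalar Gaussian measurements with known variance, for which the mean log-likelihood has a closed form; the closed form, the finite-difference step, and the function names are assumptions introduced for illustration.

```python
import math

def ambiguity(omega, omega_true, n_meas, sigma):
    """Closed-form mean log-likelihood for n_meas independent scalar Gaussian
    measurements z ~ N(omega_true, sigma^2), evaluated at candidate omega."""
    const = -0.5 * n_meas * math.log(2.0 * math.pi * sigma ** 2)
    # E[(z - omega)^2] = sigma^2 + (omega - omega_true)^2 for each measurement
    expected_sq = n_meas * (sigma ** 2 + (omega - omega_true) ** 2)
    return const - 0.5 * expected_sq / sigma ** 2

def curvature_bound(omega_true, n_meas, sigma, h=1e-3):
    """Inverse of the negated second derivative at the peak, by central differences."""
    a_plus = ambiguity(omega_true + h, omega_true, n_meas, sigma)
    a_mid = ambiguity(omega_true, omega_true, n_meas, sigma)
    a_minus = ambiguity(omega_true - h, omega_true, n_meas, sigma)
    second = (a_plus - 2.0 * a_mid + a_minus) / h ** 2
    return -1.0 / second

bound = curvature_bound(omega_true=2.0, n_meas=5, sigma=1.0)
```

For this model the curvature at the peak is -n/sigma^2, so the bound reduces to sigma^2/n, the familiar variance of a sample mean; a more sharply peaked ambiguity function therefore corresponds to a smaller bound.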

Recently, interest has been directed toward defining and using Cramer-Rao-like lower
bounds  for  object  tracking  [60,  62,  61,  39].  No  analogous  bound  has  been  published  for 
dynamic  object  or  target  recognition,  and  at  least  one  recent  article  has  solicited  such 
bounds  on  performance  [30].  It  will  be  noted  in  Chapter  V  that  evaluation  of  object 
recognition  algorithms  by  generalized  ambiguity  functions  offers  a  natural  extension  to 
the  concept  of  a  Cramer-Rao-like  lower  bound  for  object  recognition,  although  in  fact  the 
ambiguity function or classical probabilities of correct/incorrect recognition may be more
meaningful approaches by which to define recognition performance. Also in Chapter VI,
we  will  discuss  the  utility  of  a  Cramer-Rao  lower  bound  in  object  recognition. 

In 1970, Altes [1] evidently recognized the applicability of maximum likelihood estimation
and a generalized ambiguity function (defined rather differently than the concept
by Schweppe referred to above) to object recognition, but Altes’ proposals have not been
exercised in the open press (or in the classified press, so far as can be determined).
Subsequent researchers developing Altes’ approach [2, 7, 173, 200] have been concerned only
with maximum likelihood state estimation (as for locating submerged objects with sonar)
rather than parameter estimation as for object recognition (note that some of the referenced
articles use the term “parameters” to refer to position information which we would
term  “states”). 

Aside  from  the  writing  of  Altes,  so  far  as  the  author  has  been  able  to  determine  in  an 
extensive  survey  (including  all  references  listed  in  the  bibliography),  neither  the  concept  of 
generalized  likelihood  functions  nor  the  generalized  ambiguity  function  has  been  employed 
in  multisensor  fusion  for  object  recognition,  at  least  in  the  explicit  estimation  theoretic 
sense  in  which  they  have  been  employed  for  estimation  of  states  and  parameters  [152,  154, 
198]. It seems clear that many published systems could be analyzed from this
perspective; it is, however, the author’s firm impression that they have not been. This
effort has done so.

2.8  Conclusion 

With  the  conclusion  of  this  chapter,  we  have  in  place  all  of  the  tools  with  which  to 
define  and  conduct  the  proposed  research.  In  particular,  we  have  discussed  two  classes  of 
algorithms  -  the  Kalman  filter  and  related  estimators,  and  dynamic  programming  sequence 
comparison  -  and  their  previous  application  in  state  and  parameter  identification  and 
syntactic  pattern  recognition. 

In  following  chapters,  we  will  develop  the  theory  and  practice  for  applying  these 
tools  -  separately  and  together  -  in  dynamic  object  recognition.  Our  intent  in  each  case 
will be to exploit for recognition the characteristic coupling between states, parameters,
and measurements of dynamic physical objects. We will use knowledge of such coupling
and sequences of measurements, evaluating the consistency of measurement sequence generation
with the known coupling for each object class. Knowledge of characteristic coupling
allows  us  to  consider: 

(1)  The  joint  likelihood  of  observed  events  over  time  from  known  target  classes,  conditioned 
on  past  measurements  and  a  priori  information  for  each  class. 

(2) The syntax of observed events from unclassified objects, by comparison with the syntax
of event sequences expected from known target classes. For known physical objects,
the process of generating sequences or sequence spaces of expected elements for each class
inherently considers the joint or coupled nature of the processes which produce the sequences.

(3)  Restrictions  on  the  domains  of  likelihood  functions  used  to  identify  known  object  classes, 
according  to  joint  or  coupled  behavior  expected  over  time.  We  will  reject  object-class 
associations  that  do  not  fit  reasonable  restrictions. 

These  three  considerations  are  simply  different  ways  of  making  the  same  statement. 
The  only  differences  in  our  application  of  these  considerations  from  case  to  case  will  be 
driven  by  the  limitations  of  the  available  tools.  We  will  see  that  the  Kalman  filter  and 
dynamic programming sequence comparison techniques possess a combination of characteristics
which allow them together to exploit joint likelihood for objects with linear and
nonlinear  state  and  measurement  spaces  -  a  frequent  combination  for  dynamic  physical 
objects. 

The  remainder  of  this  dissertation  will  refer  to  the  concepts  discussed  in  this  chapter 
to  put  the  author’s  research  into  perspective  with  previous  efforts.  The  next  chapter  lays 
out  the  major  functional  elements  of  the  author’s  contribution.  Subsequent  chapters  will 
demonstrate those elements.

III. Exploiting Joint Likelihood in Object Recognition

3.1 Introduction

In this chapter, we consider approaches for exploiting the joint likelihood of “kinematic”
(classically, position and velocity) and “nonkinematic” (sensor signature or feature
observable) measurements for dynamic object and target recognition. We will examine
conventional and unconventional estimator structures, and define constructs whereby these
structures can be combined to perform specific tasks in particular situations. All of these
estimators will involve forms of sequence comparison, since measurements from an unclassified
physical object occur naturally over time in sequences that contain much information
about  the  joint  likelihood  of  their  generation  by  any  particular  a  priori  known  object  class. 

Recall  that  our  objective  is  to  make  improved  (more  probably  correct)  estimates  of 
object  class,  based  or  conditioned  on  measurements  from  all  available  sensors.  Following 
the discussion in Sect. 2.6.2, we aspire to produce a pattern recognition system that gives
the a posteriori probability $p(\omega_i \mid Z^f_k, Z^d_m)$ that we are actually observing an object of class
$\omega_i$, given a set of k feature observable measurements $Z^f_k = \{z^f_1, z^f_2, z^f_3, \ldots, z^f_k\}$, m kinematic
or dynamic measurements $Z^d_m = \{z^d_1, z^d_2, z^d_3, \ldots, z^d_m\}$, and a priori object class probabilities
$p(\omega_i)$ for each of J known object classes. Historically, however, as noted in
Sects. 2.2 and 2.6.2, only feature observable (“nonkinematic”) information $Z^f_k$ has been
used  for  object  recognition,  although  kinematic  information  has  been  used  to  limit  search 
windows. 
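
The a posteriori class probability described above follows from Bayes' rule once a joint likelihood of the feature and kinematic measurements is available for each class. The sketch below assumes those per-class joint log-likelihoods have already been computed by some means; the numerical values and the function name are hypothetical.

```python
import math

def class_posteriors(log_likelihoods, priors):
    """Posterior class probabilities from per-class joint log-likelihoods and priors."""
    scores = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    peak = max(scores)                              # shift for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical joint log-likelihoods of the observed feature and kinematic
# measurement sets under each of J = 3 known object classes.
joint_log_likelihoods = [-10.0, -12.0, -15.0]
priors = [1.0 / 3.0] * 3                            # equal a priori class probabilities
posteriors = class_posteriors(joint_log_likelihoods, priors)
```

Working in the log domain and subtracting the largest score before exponentiating avoids underflow when likelihoods accumulated over long measurement sequences become very small.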

Expanding significantly the approach of Therrien [211], we will not be content to limit
ourselves to cases in which it is possible to make a linear prediction of future measurements,
based on current state estimates derived from previous measurements. For feature spaces
which are highly nonlinear functions of an underlying dynamic space subject to
“high frequency” variations or unpredictable transitions (radar signature as a function of
aspect angle, in particular), linear or linearized prediction is impossible in a practical sense.
Therefore, we now set out to extend the status quo of dynamic object / tactical target
recognition in three steps.

First, we will define a new class of recognition algorithms based on kinematic/aspect
trackers (recalling Sect. 2.3.3) that combine kinematic information while obeying feature
observable constraints. These algorithms will exploit the joint likelihood of observed kinematics,
conditioned on feature observable or signature measurements. The results will
provide major practical extensions to the proposals of Therrien and Eagle (the latter being
discussed previously in Sect. 2.3.3.1).

Second, we will define a different class of algorithms that combine feature observable
information while obeying feasible or observed object kinematic constraints. These
algorithms  will  exploit  the  joint  likelihood  of  measured  feature  observables,  conditioned 
on  kinematic  measurements.  The  results  will  provide  significant  theoretical  and  practical 
extensions  to  the  efforts  of  Le  Chevalier  et  al.  [136]  and  Mieras  et  al.  [164,  165]. 

Either  of  these  algorithm  classes  can  be  used  for  stand-alone  object  recognition,  and 
that  is  a  primary  approach  taken  in  this  research.  Third,  however,  we  will  show  how 
these  two  approaches  can  be  combined  to  yield  a  new  estimator  structure  that  uses  the 
joint  likelihood  of  kinematic  and  feature  observable  observations,  conditioned  on  previous 
measurements from both domains and other a priori information, for real-time recognition
of dynamic objects. This estimator structure shows considerable promise for efficient
state and parameter estimation in cases involving both linear and nonlinear state
space/measurement space relationships. Although timelines for this research did not permit
implementation and testing of the third construct, the theory and structure for design
of  such  a  new  estimator  are  laid  out  clearly.  All  of  these  developments  will  comply  with 
well-understood  practices  of  Bayesian  parameter  estimation. 

The  next  two  sections  discuss  terminology  and  factors  that  bear  on  measurements  and 
estimation  for  dynamic  objects  in  general  and  tactical  targets  in  particular.  Subsequent 
sections  address  new  recognition  schemes.  The  last  section  in  this  chapter  motivates  new 
techniques  for  evaluating  algorithm  performance.  Techniques  considered  include  classical 
probabilities  of  correct  and  incorrect  identification,  as  well  as  the  generalized  ambiguity 
function introduced in the previous chapter.

3.2 Spaces and Dimensions

In  the  subsequent  discussions,  it  will  be  necessary  to  deal  with  at  least  four  different 
concepts  of  spaces  and  their  associated  dimensions.  All  have  been  previously  introduced  in 
this  dissertation,  but  their  subsequent  close  association  may  cause  some  confusion,  so  they 
are  revisited  here  for  contrast  and  comparison.  These  are  (1)  physical  or  three-space,  (2) 
aspect  angle  space,  (3)  feature  or  feature  observable  space,  and  (4)  warping  path  space. 

3.2.1 Physical or Three-Space. This is the usual Euclidean representation of
three-dimensional physical space. In general, the behavior of real objects in this space is described
by positions and higher derivatives in six degrees-of-freedom (“6-DOF”), corresponding to
three  translational  and  three  rotational  degrees  of  freedom  -  translation  and  rotation  states. 
Behavior  can  only  be  quantified  relative  to  some  reference  frame,  generally  Cartesian, 
which  may  be  stationary  or  non-stationary  in  physical  space.  Position  or  displacement  in 
the  rotational  degrees  of  freedom  is  referred  to  as  angular  orientation. 

It  is  important  to  note  that  for  a  6-DOF  object,  the  term  kinematics  properly  refers 
to  both  translational  and  rotational  state  behavior.  As  we  have  noted  in  previous  chapters, 
however,  the  general  inability  to  measure  rotation  state  directly  using  remote  sensors 
has led to the use of the term “‘kinematic’ measurements” to refer to measurements of
translational  state  variables  only  -  classically,  position  and  velocity.  In  this  development, 
the  distinction  will  be  clarified  where  required. 

3.2.2 Aspect Angle Space. This space is the entire $4\pi$ steradian extent of the
hypothetical aspect angle sphere (equivalently, the surface of the unit sphere) as shown in
Fig. 1.2, which is closed under allowable transitions on that sphere, and which is therefore
considered an aspect angle “space.” It may also be necessary to speak of a region or
“window” on, or subset of, the hypothetical aspect angle sphere. The entire aspect angle
space and regions in general are inherently two-dimensional, if we consider only in-plane
rotation-invariant (see “PSRI,” in App. A) feature observables. These cases may require
specification of feature observable values as a function of object-frame relative azimuth and
elevation angles, as shown in Fig. 5.1, or as a function of the equivalent object-to-sensor
unit  vector  in  the  object  body  frame. 
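
The equivalence between an azimuth/elevation pair and an object-to-sensor unit vector can be sketched as follows. The axis convention used here (x forward, azimuth measured in the x-y plane, elevation positive toward +z) is an assumption made for illustration and may differ from the convention of Fig. 5.1; the function names are hypothetical.

```python
import math

def aspect_to_unit_vector(azimuth, elevation):
    """Object-to-sensor unit vector in the body frame from azimuth/elevation (radians)."""
    x = math.cos(elevation) * math.cos(azimuth)
    y = math.cos(elevation) * math.sin(azimuth)
    z = math.sin(elevation)
    return (x, y, z)

def unit_vector_to_aspect(u):
    """Recover (azimuth, elevation) in radians from a body-frame unit vector."""
    x, y, z = u
    # Clamp z to guard against rounding slightly outside [-1, 1].
    return (math.atan2(y, x), math.asin(max(-1.0, min(1.0, z))))

u = aspect_to_unit_vector(math.radians(30.0), math.radians(-10.0))
az, el = unit_vector_to_aspect(u)
```

Both forms describe a point on the two-dimensional aspect angle sphere; neither carries the roll about the object-sensor vector.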

In dealing with a non-rotation-invariant feature observable, or where a transformation
is required from aspect angle to object rotation state with respect to some frame in physical
space (discussed below), a third dimension - roll about the object-sensor vector - must
be added to specifications of aspect angle. An aspect angle path, as seen in Fig. 1.2, in
azimuth and elevation angle (and possibly roll, although not visible in the figure) is a
one-dimensional aspect angle region or subset.

Henceforth, a particular aspect angle state or location in aspect angle state space
(continuous or discretized) will be denoted $x^a$. The superscript a refers to aspect angle,
just as superscript d denotes translational states (associated with “kinematic” or dynamic
measurements). In general, each value $x^a$ represents the orientation of the observing sensor
relative to the observed object or target body frame at some particular time of interest.
For some time $t_k$, the corresponding aspect angle state is given as $x^a_k$.

For a known sensor angular orientation relative to some reference frame in physical
space, three-dimensional aspect angle is equivalent to a sensor-relative representation of
the object’s angular orientation or “rotation state” relative to that physical space reference
frame. Since this restriction will apply in general for the discussions here, the term $x^a$
may be used interchangeably for sensor-relative “aspect angle” or rotation state relative
to some other, generally stationary reference frame in three-space. Distinctions between
aspect angle state relative to the sensor and rotation state relative to some other frame
will be clarified where necessary.

3.2.3 Feature or Feature Observable Space. This is the classic feature space discussed
in Sect. 2.2 - a vector space in which a given feature observation or measurement
of the object can be expressed as a point. Thus, the dimension of the feature space is
determined by the number of individual scalar elements in a feature observable measurement
vector. The dimensions may or may not be independent in any particular sense of
the word - functional, statistical, and/or temporal relationships may exist between feature
space “dimensions”. Feature spaces of dimension n are generally taken to be isomorphic
[170:173] to $\mathbb{R}^n$, the (Cartesian) vector space defined by a collection of n mutually
orthogonal real axes sharing a common origin.

The  Cartesian  product  space  formed  by  the  aspect  angle  space  and  some  feature 
space  provides  a  framework  for  description  of  the  appearance  of  any  object  to  a  sensor 
system  extracting  the  given  features  as  the  object  traverses  any  given  path  in  aspect  angle. 
Each  object  model  corresponds  to  a  particular  set  of  feature  observable  values  for  each 
aspect  angle  value  in  this  Cartesian  product  space.  Due  to  the  stochastic  nature  of  feature 
observable  generation  and  measurement,  a  given  point  in  aspect  angle  space  for  any  one 
model  will  invariably  be  associated  with  many  points  in  feature  observable  space.  The 
realization  likelihood  of  particular  feature  observable  values  for  any  aspect  angle  will  be 
governed  by  some  probability  density,  which  generally  will  be  dependent  on  the  state  of 
the  object,  sensor,  and  other  factors  (consider,  for  example,  the  effect  of  tactical  target 
engine  vibrations  on  radar  returns). 

As  in  Sect.  2.6.2,  a  particular  element  (generally,  a  noisy  measurement  or  observation) 
from a feature observable space (continuous or discretized) will be denoted $z^f$, where the
superscript f refers to feature. A sequence or vector composed of such elements, where in
general each element corresponds to a different measurement time, will be denoted $Z^f$, as
in  Sect.  2.6.2. 

3.2.4  Space  of  Allowable  Warping  Paths.  As  defined  and  discussed  in  Sect.  2.4.3, 
this  is  the  set  of  all  possible  associations  between  elements  in  two  sequences,  or  between 
a  sequence  and  a  region  from  which  sequences  can  be  extracted.  When  comparing  a 
one-dimensional sequence to another one-dimensional sequence, the space of all possible
associations is two-dimensional, as shown in Fig. 2.8. When comparing a one-dimensional
sequence  to  a  two-dimensional  set  of  possible  sequence  elements,  the  space  of  all  possible 
associations  is  three-dimensional,  as  shown  in  Fig.  2.9. 

3.3  Joint  Likelihood  in  Object  Recognition 

Recall the discussion of multiple model filters and parameter identification in Chapter
II (see Eqns. (2.20) through (2.22)), following the development in Maybeck Vol. II
[154:129-136]. We noted that conventional (e.g., linear, linearized or extended Kalman)
filter  structures  can  be  defined  readily  in  a  Bayesian,  multiple  model  residual  sequence 
analysis  framework  to  perform  parameter  estimation. 

In  that  conventional  multiple  model  parameter  identification  approach,  we  use  the 
a  priori  likelihood  of  each  model  (feasible  parameter  set)  and  -  from  residual  analysis  - 
the  joint  likelihood  of  each  model  having  produced  the  observed  measurements  at  each 
measurement  event  over  time  to  identify  the  most  probably  correct  parameter  set.  We 
noted  that  this  approach  is  intimately  related  to  the  target  recognition  approach  taken  by 
Therrien [211], and recommended by Daum [9:177-178]. In target recognition, the discrete
sets  of  parameters  are  the  known  target  classes,  and  we  have  measurements  in  “kinematic” 
(physical  translation  space)  and  feature  observable  domains. 

For many object or target states, parameters, and measurements of interest, a conventional
filter will adequately model the desired object behavior, and provide a basis for
residual analysis; that is, checking the deviations over time between (1) measurements predicted
for each object class, and (2) measurements observed from the unclassified object.
This  is  particularly  true  for  the  classical  “kinematic”  or  translational  states  and  associated 
measurements  of  range,  pointing  angles,  and  associated  rates,  or  alternative  physical  space 
equivalents. 

But physical objects are inherently six degree-of-freedom or “6-DOF” entities - that
is, to describe their kinematics or motion fully, we need to consider states (positions and
derivatives) in three translational and three rotational state subspaces. A proper description
would in general consider each subspace separately, but the state dynamics for many
physical objects are highly coupled across the subspace boundaries. For any particular object,
the nature and degree of this coupling are generally a function of object parameters
that can be directly or indirectly expressed in the state dynamics (mathematical) model.

For example, in the case of conventional airplanes (recall Sect. 2.3.3.1), rotational
states are often closely coupled to translational acceleration states. The parameters which
govern this coupling are aircraft class-specific quantities like wing surface area, coefficient
of lift, mass, and so on. The coupling implies that rotation state measurements could betray
much regarding future kinematic velocity and position. Unfortunately, the kinematic
position measurements to which most object trackers are limited provide little observability
of rotation state (or aspect angle, from a sensor-relative perspective).

Now  for  most  objects  of  interest,  translational  and  rotational  dynamic  states  are 
related  in  a  nonlinear  fashion,  but  often  can  be  linearized  for  modelling  and  prediction  in  a 
conventional  filter  structure.  If  we  could  measure  translational  and  rotational  states  directly 
for  some  class  of  objects  and  apply  these  measurements  to  an  appropriate  multiple  model 
set  of  conventional  filters,  we  might  expect  that  an  incorrect  combination  of  measurements 
and  filter  model  would  exhibit  high  residual  error,  where  the  true  and  filter  model  state 
dynamics coupling assumptions differ. This high residual error would indicate that the
particular filter model has a low joint likelihood of producing the observed translational
and  rotational  measurements.  The  residuals  could  then  be  combined  in  the  classical  way 
to  yield  an  a  posteriori  probability  of  class  membership. 
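
One common way to carry out that combination can be written down explicitly. The following is a sketch only, assuming that each filter's residuals are modeled as zero-mean Gaussian; the symbols $r_n$, $A_n$, and $m$ are introduced here for illustration and are not notation from this chapter. Each model $\omega_i$ is scored at each measurement time by the likelihood of its own residual:

```latex
p(z_n \mid Z_{n-1}, \omega_i)
  = (2\pi)^{-m/2} \, |A_n|^{-1/2}
    \exp\!\left( -\tfrac{1}{2} \, r_n^{T} A_n^{-1} r_n \right)
```

Here $r_n$ is the residual produced at time $t_n$ by the filter built on model $\omega_i$, $A_n$ is the residual covariance computed by that filter, and $m$ is the measurement dimension. Under this assumption, a filter whose coupling model is wrong tends to produce residuals that are large relative to $A_n$, so its likelihood is small at each step.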

In  general,  it  is  very  simple  to  obtain  “kinematic”  or  translational  measurements 
that  are  linear  functions  of,  or  readily  linearized  with  respect  to,  translational  state  spaces. 
Unfortunately,  as  noted  above,  we  cannot  in  general  measure  “rotation  state”  directly  for 
remotely  observed  objects.  However,  feature  observables  or  signatures  are  generally  direct 
functions  of  aspect  angle,  and  their  measurements  potentially  contain  much  information 
about  rotational  states. 

The  problem  is  that,  in  typical  cases  of  interest,  conventional  estimators  often  cannot 
exploit  the  relationship  between  rotational  states  and  feature  observable  measurements. 
Simply,  it  is  frequently  impossible  to  make  a  reasonable  linear  or  linearized  prediction  of 
the  expected  feature  observable  measurement  from  current  knowledge  of  aspect  angle  or 
rotational  state.  Under  these  circumstances,  the  classical  approach  of  Therrien  [211]  or 
multiple  model  residual  analysis  cannot  be  used. 

First,  particularly  for  aircraft  targets,  true  aspect  angle  state  values  are  likely  to 
change  unpredictably  over  time.  In  a  classical  state  estimation  sense,  we  say  that  they 
have  high  “driving  noise  strength”  or  “Q”  (see  Sect.  2.3.1),  driven  directly  as  they  are  by 
unobservable,  unpredictable,  often  high  gain  operator  inputs  (e.g.,  roll/pitch  commands). 

Second, the relationship between feature observable measurements $z^f = h(x^a) + v$
(relaxing the notation from Sect. 2.3.1.1 somewhat for clarity) and aspect angle state
$x^a$ may be poorly understood or highly nonlinear, even to the extent that the partial
derivative $\partial h(x^a)/\partial x^a$ and expected measurement $z^f = h(x^a)$ information required by
linear or linearized filters (Eqn. (2.18)) cannot be reasonably defined.

Even if the relationship is known, the “deterministic” measurement component $h(x^a)$
of $z^f$ (see Eqn. (2.15)) and its partial derivative may vary so quickly with respect to
aspect angle state that reasonable, otherwise unobservable changes in aspect angle render
the partial derivative and expected measurement unpredictable. In other words, if the
unpredictable aspect angle state change between measurements (a function of Q and
inter-measurement time interval) is likely to be such that the expected measurement $h(x^a)$
cannot be predicted, or the variation of $h(x^a)$ cannot adequately be approximated as
linear over the range of current and likely future $x^a$ values, then conventional
“predictor-corrector” filtering will not work.

This is precisely the situation for aircraft targets and their radar signatures (particularly
narrowband, non-high range resolution) due to high aspect angle rates and scatterer
interactions as described in Chapter II. We can illustrate this problem for any notional
feature space (not limited to radar), as shown in Fig. 3.1. Suppose your state location
at time $t_0$ is somewhere in the region marked $x^a_0$, but at the next measurement time $t_1$,
when your state may lie somewhere in the region marked $x^a_1$, you receive a measurement
$h(x^a_1)$ (a function solely of some unknown particular $x^a_1$). As a conventional filter designer,
your first questions are (1) “what value of $\partial h(x^a)/\partial x^a$ is appropriate?” and (2) “what
is my expected measurement $z^f_1 = h(x^a_1)$ at $t_1$?”. We quickly observe that, due to the
rapid variations of $h(x^a)$ as a function of $x^a$ and the uncertainty in your current and new
state locations, it is clearly impossible to define a meaningful partial derivative value or an
expected measurement.
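
The difficulty can be made concrete with a toy calculation. The sketch below is illustrative only: the measurement function h(x) = sin(rate * x), the rates, and the width of the uncertainty region are all hypothetical choices, not values taken from this research. It compares the worst-case error of a tangent-line (linearized) prediction for a slowly varying and a rapidly varying h over the same aspect angle uncertainty region.

```python
import math

def worst_linearization_error(rate, x0, half_width, steps=200):
    """Largest gap between h(x) = sin(rate * x) and its tangent line at x0,
    taken over the uncertainty region [x0 - half_width, x0 + half_width]."""
    slope = rate * math.cos(rate * x0)          # dh/dx evaluated at x0
    worst = 0.0
    for i in range(steps + 1):
        x = x0 - half_width + 2.0 * half_width * i / steps
        predicted = math.sin(rate * x0) + slope * (x - x0)   # linearized prediction
        worst = max(worst, abs(math.sin(rate * x) - predicted))
    return worst

# Same uncertainty region, two hypothetical measurement functions.
slow = worst_linearization_error(rate=1.0, x0=0.0, half_width=0.2)
fast = worst_linearization_error(rate=40.0, x0=0.0, half_width=0.2)
print(f"slowly varying h: {slow:.4f}, rapidly varying h: {fast:.4f}")
```

In this toy case the linearization error for the rapidly varying function is larger than the entire span of values that h can take, which is the sense in which no meaningful partial derivative or expected measurement exists.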

The conditions put forth in the scenario of Fig. 3.1 are particularly stressing. The
next section will examine conditions under which conventional multiple model residual
analysis techniques can be used with kinematic and feature observable information for
dynamic object / target recognition - without obtaining a linear predictive model for
feature observable measurements.

Figure 3.1. A Measurement Function Unsuitable for Conventional Filtering

3.4 Conventional Multiple Model Approaches for Dynamic Object Recognition

In  this  discussion,  we  will  limit  ourselves  to  filter  state  dynamics  models  that  include 
coupled  translational  and  rotational  (or  aspect  angle)  states  only.  We  will  exploit  this 
coupling  for  multiple  model  residual  analysis  while  using  -  but  not  predicting  -  feature 
observable  measurements.  The  outcome  of  this  process  will  be  an  expression  for  the  joint 
likelihood  of  translation  state-related  or  “kinematic”  measurements  and  rotation  state- 
related  pseudo-measurements,  conditioned  on  feature  observable  measurements.  This  is 
Step  One  in  the  three  step  process  outlined  at  the  start  of  this  chapter. 

3.4.1 Required Conditions and Estimator Alternatives. First, suppose that a
one-to-one (see App. A) mapping exists between aspect angle space and some feature
observable space, and is usable as a transformation between the two spaces, for a particular
object of class “A” (failing the one-to-one requirement, we require that the inverse
mapping is of “low ambiguity”, in the sense of Sect. 2.2.1). This mapping is said to provide
a pseudo-measurement of aspect angle, or equivalently a pose estimate, as defined in
Sect. 2.2.1. For the correct association of “kinematic” (translation state) measurements
and aspect angle pseudo-measurements from this object with its coupled dynamics filter
model, a feature observable measurement will be equivalent to an aspect angle or rotation
state measurement (noting the usual relationship between aspect angle and rotation
states discussed in Sect. 3.2.2). Moreover, we expect that the residual sequence (i.e., from
all 6-DOF kinematic measurements and pseudo-measurements) for this correct association
should indicate high joint likelihood of the observed “measurements”.

However, suppose that the aspect angle-to-feature observable mapping and/or dynamics
coupling is different for some other object of class “B”. In this case, we should not
expect  that  the  aspect  angle  pseudo-measurement  derived  by  mapping  feature  observations 
from  class  “A”  into  the  aspect  angle  space  of  class  “B”  will  lead  to  high  likelihood  (small) 
residuals  from  the  filter  model  corresponding  to  class  “B”. 

Thus,  we  have  a  set  of  conditions  under  which  the  joint  likelihood  of  all  6-DOF 
kinematic  events  (translational  and  rotational  measurements  or  pseudo-measurements)  can 
be  assessed,  conditioned  on  prior  kinematic  and  feature  observable  measurements,  without 
explicitly considering the likelihood (or probabilistically-weighted distance in feature space)
of  the  feature  observable  measurements  per  se.  Restated,  those  conditions  are  (1)  existence 
and  availability  of  a  low-ambiguity  aspect  angle-to-feature  observable  mapping  for  each 
class,  (2)  differences  in  this  mapping  for  different  object  classes,  and  (3)  differences  in  the 
state  dynamics  coupling  for  different  classes.  Condition  (1)  must  apply  in  every  case,  but 
either  or  both  of  (2)  and  (3)  must  apply  only  to  the  extent  required  to  obtain  distinctly 
different  residual  sequences  or  other  evidence  of  incorrect  associations  for  objects  and 
behavior  of  interest. 

Observe  in  passing  that  the  aspect  angle-to-feature  observable  mapping  need  neither 
be  linear  nor  “onto”  (see  App.  A)  the  entire  feature  observable  space  (i.e.,  the  mapping 
must  only  be  onto  the  range  of  the  transformation  for  each  class).  The  point  here  is 
that  a  nonlinear  mapping  can  still  give  us  the  one-to-one  condition  that  we  require,  and 
a  mapping  that  is  not  “onto”  the  feature  space  allows  us  to  reject  outright  the  need  to 
investigate  a  given  class  which  cannot  map  from  its  aspect  angle  space  to  the  measured 
feature  observable  value. 

For tactical targets, the Kendrick-Maybeck / Andrisani / Sworder-type “kinematic/aspect
angle” estimators provide exactly the sort of coupled dynamics filter model
that we seek. They use “kinematic” (translation state) measurements in conventional
linear ways, and ignore the detailed relationship between true aspect angle and feature
observable measurements by allowing the estimator to accept only a pseudo-measurement
of aspect angle, generally from some pose estimator, rather than the feature observable
measurement itself. In the following section, we consider exactly how a kinematic/aspect-angle
filter would be used in a multiple model estimator for object recognition.

3.4.2 Kinematic/Aspect-Angle Estimators for Recognition. We assume that conventional
measurements of object position in three-space and relative (doppler) velocity
along the object-sensor vector are available at discrete times (measurement intervals).
Moreover, we assume that pseudo-measurements of object aspect angle are available at
the same times, say in the form of three independent Euler angles relative to some vector
frame of reference. In the nomenclature of Section 2.3.1.3, this defines a seven-state $z_i$
vector, which provides an update to each of J kinematic/aspect-angle filters (one for each
candidate object class) at each measurement event.

Recall that the pseudo-measurements of object aspect angle $z^a$ are in fact based
on aspect angle or pose estimates $\hat{x}^a$ (the “hat” denoting an estimate). In general, we
have $z^a = h(\hat{x}^a)$, with $\hat{x}^a = \arg\max_{x^a} \{p(z^f \mid x^a, \omega_i)\}$. Depending on the representation
for object rotational states in the kinematic/aspect-angle filter, the function h and filter
measurement matrix $H = \partial h(x^a)/\partial x^a$ are at most simple coordinate transforms, and
can be identity operations. Thus, in fact, the aspect angle pseudo-measurement contains
information provided by feature observables over time, or $Z^f$. The measurement covariance
for the pose estimate is a function of the a priori-characterized performance for the pose
estimator of each object class, for the given ambient conditions.
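
A class-specific pose estimate of this kind can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: a scalar feature observable, a small discretized aspect angle library per class, Gaussian measurement noise, and an identity h. The library values and the names `library_a`, `library_b`, and `pose_estimate` are hypothetical and serve only to show how the same feature measurement yields a different aspect angle pseudo-measurement for each candidate class.

```python
import math

def gaussian_likelihood(z, mean, sigma):
    """p(z | mean, sigma) for a scalar Gaussian."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def pose_estimate(z_f, library, sigma):
    """Return (aspect_angle, likelihood) maximizing p(z_f | x_a, class)
    over the discretized aspect angles stored in the class library."""
    scored = ((x_a, gaussian_likelihood(z_f, mean, sigma)) for x_a, mean in library.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical libraries: aspect angle (degrees) -> expected feature value.
library_a = {0: 1.0, 10: 3.0, 20: 2.0, 30: 5.0}
library_b = {0: 4.0, 10: 0.5, 20: 2.8, 30: 1.5}

z_f = 2.9  # one feature observable measurement, shared by all classes
pose_a, like_a = pose_estimate(z_f, library_a, sigma=0.5)
pose_b, like_b = pose_estimate(z_f, library_b, sigma=0.5)
print(pose_a, pose_b)  # pseudo-measurements handed to the class A and class B filters
```

Each filter in the multiple model bank would then receive the pseudo-measurement computed from its own class library, together with the kinematic measurements that are common to all classes.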

The  object  position  and  velocity  measurements  (common  to  all  object  classes)  and 
aspect  angle  pseudo-measurements  (in  general,  unique  to  each  class)  are  provided  to  the 
appropriate  set  of  kinematic/aspect-angle  filters  -  one  filter  for  each  object  class.  In 
turn, the filters generate translation state- and rotation state-related (i.e., coupled 6-DOF
kinematics) residuals and covariances. We can now follow Sect. 2.3.1.3 to define an object
classifier  using  residual  analysis  for  “kinematic”  (translation)  measurement  and  “aspect 
angle”  pseudo-measurement  residuals  in  the  following  Bayesian  structure: 


$$
p(\omega_i \mid Z^f_k, Z^d_{k+s}) =
\frac{\left\{\prod_{n=1}^{k} p(z^{da}_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_i)\right\} p(\omega_i)}
     {\sum_{j=1}^{J} \left\{\prod_{n=1}^{k} p(z^{da}_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_j)\right\} p(\omega_j)}
\qquad (3.1)
$$

where:

$Z^f_k$ = a set of k feature observable measurements $z^f$

$Z^d_{k+s}$ = a set of k + s “kinematic” (translation state) measurements $z^d$, where s > 1
in general (representing kinematic measurements which precede the k feature observable
measurements - to be discussed in Sect. 3.6.5)

$z^{da}_n$ = a vector formed by juxtaposing “kinematic” measurements $z^d$ and aspect angle
(pseudo) measurements $z^a$ taken or available at time $t_n$

and other variables are as defined earlier.
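
Numerically, a Bayesian structure of this kind is usually evaluated with log-likelihoods, since a product over many measurement events underflows quickly. The sketch below assumes the usual normalized form (product of per-step residual likelihoods times the class prior, divided by the sum of the same quantity over all classes); the function name `class_posteriors` and the likelihood values are hypothetical.

```python
import math

def class_posteriors(log_likelihoods, priors):
    """Combine per-step residual log-likelihoods into class probabilities.

    log_likelihoods[i] holds, for class i, the log of the residual likelihood
    at each measurement event; priors[i] is the a priori class probability."""
    log_joint = [math.log(p) + sum(steps) for steps, p in zip(log_likelihoods, priors)]
    peak = max(log_joint)                        # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Two candidate classes, two measurement events, equal priors (illustrative values).
log_likelihoods = [
    [math.log(0.5), math.log(0.4)],   # class whose filter model fits the data
    [math.log(0.1), math.log(0.2)],   # class whose filter model fits poorly
]
posteriors = class_posteriors(log_likelihoods, priors=[0.5, 0.5])
print(posteriors)
```

In practice the per-step likelihoods would come from the residuals and residual covariances generated by each kinematic/aspect-angle filter, rather than being supplied by hand as they are here.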


An immediate extension to this concept is to allow “state reasonableness monitoring”
as discussed in Sect. 2.3.1.3. Under this construct, our Bayesian expression is of the form:

$$
p(\omega_i \mid Z^f_k, Z^d_{k+s}) =
\frac{\left\{\prod_{n=1}^{k} p(z^{da}_n, \hat{x}_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_i)\right\} p(\omega_i)}
     {\sum_{j=1}^{J} \left\{\prod_{n=1}^{k} p(z^{da}_n, \hat{x}_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_j)\right\} p(\omega_j)}
\qquad (3.2)
$$

where:

$\hat{x}_n$ = an estimated state or states which we wish to compare to anticipated values
for each model $\omega_i$ at each time $t_n$

and other variables are as defined earlier.

3.4.3 Summary. Given a pose estimate from a feature observable sensor and
associated pose estimators, then, Kendrick et al.-type filters in a multiple model structure
provide a natural tool for residual analysis and Bayesian estimation of object class membership,
considering the joint likelihood of object behavior with respect to kinematic states
and measurements that the filter is well-suited to process. Chapter IV will propose a suitable
estimator model for implementing these equations, and show how residual modeling
and state reasonableness checking betray an incorrect object-model association.

This  straightforward,  conventional  multiple  model  construct  provides  Step  One  in 
the  three-step  process  discussed  at  the  start  of  the  chapter.  The  second  (most  challenging) 
and third steps remain to be solved. In the following section, we pursue an alternative approach
to object recognition, which will lead to Step Two. Those results will be integrated
ultimately  with  conventional  multiple  model  techniques  to  form  a  new  class  of  estimators  - 
satisfying  Step  Three. 

3.5  Comparing  Observed  and  Expected  Feature  Observables  Without  Linear  Prediction 

Recall from Sect. 3.4.1 that necessary conditions (2) and (3) for the use of kinematic/aspect
filters in object recognition required differences in the feature observable-to-aspect
angle mapping for different objects, and/or differences in the state dynamics
coupling for different classes. The reader should note that these conditions are not sufficient
for the development of residual differences between correct (true) and incorrect filter
models.  That  is,  we  may  conceive  of  some  true  object  and  trajectory  which  yield,  from 
some  feature  observable  domain,  pose  estimates  that  are  reasonable  for  both  correct  and 
(one  or  more)  incorrect  object  library  classes,  given  the  observed  kinematics. 

Kinematically  then,  even  with  a  pose  estimate,  the  two  classes  are  ambiguous,  and  to 
distinguish  them  we  may  need  to  look  also  at  matching  distances  in  the  feature  observable 
space.  Recall  that  the  pose  estimate  is  an  angle  value,  and  it  says  nothing  about  the 
closeness  of  measured  and  library  signatures  in  feature  observable  space  -  a  single  signature 
could  yield  reasonable  pose  estimates  for  two  different  library  classes,  even  though  the 
matching  distance  in  feature  observable  space  is  much  greater  for  one  library  class  than  for 
the  other. 

Anticipating the need to recognize dynamic objects with ambiguous feature observable
spaces, however, as implied in the introduction to Sect. 3.3, we still cannot confidently
make a predictive estimate (linear or otherwise) of the next feature observable measurement,
for comparison to the observed measurement. What other alternatives do we have
for comparing the observed feature observable measurement to possible values for each
object class?

First, for a discretized aspect angle space $x^a$, a typical decision theoretic, parametric
pose estimator for object class $\omega_j$ may provide the value $\max_{x^a} \{p(z^f \mid x^a, \omega_j)\}$ - that is,
the likelihood of observing a single feature observable value $z^f$ at the aspect angle value $x^a$
(the pose estimate) where that likelihood is maximized. Note the use here of a lower case z,
referring to a measurement at a single discrete time. For a statistically well-behaved pose
estimator, that is, one exhibiting low, unbiased ambiguity with respect to aspect angle for
the correct object-library association, this output likelihood may be a useful metric for
comparison.

In general, however, comparing these values for two different object classes $\omega_i$ and
$\omega_j$ is a questionable approach to comparing distances in feature observable space, because
each such value represents only a maximum likelihood value derived from one aspect angle
state, rather than a maximum a posteriori estimate found by considering all possible states
$x^a$. Under circumstances where the likelihood of a given feature observable measurement
changes  rapidly  with  aspect  angle  -  as  seen  in  Fig.  3.1  -  use  of  the  likelihood  from  the  pose 
estimator  is  decidedly  suboptimal.  Moreover,  in  these  cases  we  expect  that  a  classical  pose 
estimate  history  -  considering  each  feature  observable  measurement  independently  -  will 
be  very  “noisy”,  exhibiting  random  changes  in  aspect  angle,  perhaps  even  for  the  correct 
object-library  association.  How  can  we  derive  an  accurate  aspect  angle  estimate  for  the 
(unknown  a  priori)  correct  class  in  these  cases? 

To  address  these  issues,  we  will  develop  the  methodology  of  Larson  and  Peschon  for 
use  in  object  recognition.  It  will  be  seen  that  the  Larson  and  Peschon  approach  is  eminently 
applicable  for  dealing  with  aspect  angle  states  and  ambiguous  feature  observables,  provided 
that we have some basic knowledge about the “a priori” likelihood of transitions in the
aspect angle space $x^a$.

Unlike  conventional  estimators,  the  Larson  and  Peschon  estimator  does  not  create 
one  predicted  state  vector  and  corresponding  predicted  measurement  for  comparison  with 
the  next  measurement.  Rather,  the  Larson  and  Peschon  estimator  evaluates  all  reasonable 
new  state  vectors  (candidate  locations),  considering  the  likelihood  of  observing  the  new 
measurement  at  each  candidate  location,  and  the  likelihood  of  being  at  that  new  location 
given  (1)  the  likelihood  of  being  at  some  previous  state  location,  and  (2)  the  likelihood  of 
moving  from  that  previous  location  to  the  current  candidate  location.  The  use  of  dynamic 
programming  provides  a  method  for  performing  these  calculations  without  resorting  to  an 
exhaustive  search  covering  all  state  locations  over  all  time. 
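
The recursion described above has the same shape as a max-product (Viterbi-style) forward dynamic program over a discretized state space. The sketch below implements that generic recursion; it is offered as an illustration of the idea rather than as the Larson and Peschon equations themselves, and the three-cell state space, transition table, and measurement likelihoods are hypothetical values chosen so that the middle measurement is ambiguous.

```python
import math

def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")

def forward_dp(prior, transition, meas_likelihood):
    """Return (max log joint likelihood, most likely state path).

    prior[j]              a priori likelihood of starting in state j
    transition[i][j]      likelihood of moving from state i to state j
    meas_likelihood[t][j] likelihood of the measurement at time t in state j"""
    n = len(prior)
    score = [safe_log(prior[j]) + safe_log(meas_likelihood[0][j]) for j in range(n)]
    back_pointers = []
    for t in range(1, len(meas_likelihood)):
        new_score, pointers = [], []
        for j in range(n):
            # best previous location from which to arrive at candidate location j
            best_i = max(range(n), key=lambda i: score[i] + safe_log(transition[i][j]))
            new_score.append(score[best_i] + safe_log(transition[best_i][j])
                             + safe_log(meas_likelihood[t][j]))
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)
    last = max(range(n), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path

prior = [1 / 3, 1 / 3, 1 / 3]
transition = [[0.6, 0.4, 0.0],     # a jump from cell 0 to cell 2 is not allowed
              [0.2, 0.6, 0.2],
              [0.0, 0.4, 0.6]]
meas = [[0.7, 0.2, 0.1],
        [0.1, 0.4, 0.5],           # ambiguous between cells 1 and 2
        [0.1, 0.2, 0.7]]

log_joint, path = forward_dp(prior, transition, meas)
independent = [max(range(3), key=row.__getitem__) for row in meas]
print(path, independent)
```

Treating each measurement independently selects a state history that contains a disallowed jump, whereas the recursion returns a history consistent with the transition likelihoods, at a cost that grows linearly with the number of measurements rather than exponentially.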

Finally,  as  we  have  seen,  the  output  of  the  Larson  and  Peschon  equations  is  naturally 
a measure of the maximum joint likelihood of observed events, conditioned on a priori information.
For this reason, it will fit naturally into our desire to exploit the joint likelihood
of  observed  events,  allowing  us  to  work  with  aspect  angle  and  feature  observable  spaces 
that  cannot  be  treated  with  conventional  linear  estimation  techniques. 

Sect. 3.6 develops the Larson and Peschon methodology in detail for object recognition.
We have seen in Sect. 2.4.2 that the Larson and Peschon equations are a particular
form  of  forward  dynamic  programming  sequence  comparison  -  this  observation  will  lead 
naturally  to  the  consideration  of  classical  dynamic  programming  sequence  comparison  and 
other  sequence  comparison  methods  for  dynamic  object  recognition. 

Development  of  these  forward  dynamic  programming  sequence  comparison-based 
techniques  for  dynamic  object  or  target  recognition  will  accomplish  Step  Two  of  the  three 
step  process  outlined  at  the  start  of  this  chapter.  This  step  will  be  accomplished  by  looking 
at differences between objects in feature observable space only, but considering likely kinematics.
That is, we will use “kinematic” (i.e., translation state-derived) measurements,
with knowledge of the state dynamics translation/rotation coupling for different object
classes, to restrict our choices for matching feature observable measurements to feature
observable libraries for each class according to likely dynamics. The result will be an expression
for the joint likelihood of measured feature observables for each class, conditioned
on  kinematic  measurements. 
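
One simple way to picture this restriction is as a table of aspect angle transition likelihoods derived from the kinematics. The sketch below is hypothetical throughout: it assumes a circular grid of azimuth cells, a kinematics-derived prediction of how many cells the aspect angle should shift between measurements, and a truncated Gaussian spread about that prediction. None of these modeling choices is taken from this research.

```python
import math

def transition_row(i, n_cells, shift, sigma, max_dev=3.0):
    """Likelihood of moving from aspect cell i to every cell j, given a
    kinematics-derived predicted shift (in cells) with spread sigma (in cells).
    Moves farther than max_dev * sigma from the prediction are disallowed."""
    weights = []
    for j in range(n_cells):
        # signed circular difference between the candidate move and the predicted move
        dev = (j - i - shift + n_cells / 2) % n_cells - n_cells / 2
        allowed = abs(dev) <= max_dev * sigma
        weights.append(math.exp(-0.5 * (dev / sigma) ** 2) if allowed else 0.0)
    total = sum(weights)
    return [w / total for w in weights]

# 36 cells of 10 degrees each; the kinematics predict a shift of +3 cells from cell 34.
row = transition_row(i=34, n_cells=36, shift=3, sigma=1.0)
most_likely = max(range(36), key=row.__getitem__)
print(most_likely, [j for j, w in enumerate(row) if w > 0.0])
```

Rows of this kind, built separately for each class from its own translation/rotation coupling, could serve as the transition likelihoods in a forward dynamic programming recursion, so that feature observable measurements are matched only against library entries at kinematically plausible aspect angles.
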

This  process  will  not  require  differences  in  the  translation/rotation  state  coupling 
between  classes.  However,  as  a  consequence  of  this  process,  a  high  quality  pose  estimate 
will  be  obtained  for  each  class,  even  for  noisy  feature  observable  spaces.  This  fact  will 
allow  us  to  combine  kinematic/aspect  recognizers  with  forward  dynamic  programming- 
based  recognizers  for  use  in  ambiguous  feature  observable  spaces.  This  combination  will 
accomplish  Step  Three  of  the  three  step  process.  The  result  is  a  new  form  of  estimator. 

The  following  section  develops  extensions  to  the  Larson  and  Peschon  equations  for 
object recognition. Subsequent sections consider other forward dynamic programming sequence
comparison-based approaches - both new forms developed in this research and related
concepts developed by other researchers. Performance of these recognizers is demonstrated
in Chapter V.

3.6  Applying  the  Larson  and  Peschon  Approach  in  Recognition 

The  purpose  of  this  section  is  to  apply  Bayes’  Rule  [153, 197],  the  Larson  and  Peschon 
methodology  (see  Sect.  2.4.4  and  [133])  and  aspect  angle  state  transition  information  given 
by  kinematic  state  estimates  (i.e.,  6-DOF  kinematic  state  estimates,  possibly  based  on 
translational  measurements  only:  see  Sect.  5.5),  to  provide  a  representation  for  the  a 
posteriori probability of class membership $p(\omega_i \mid Z^f_k, Z^d_{k+s})$ for each class $\omega_i$, or a reasonable
estimate thereof. In this development, the new information, in a Bayesian sense, is given
by the feature observable measurements. The a priori information is given by kinematic
measurements and prior knowledge of class probability $p(\omega_i)$.

This effort will achieve Step Two of the three-step process outlined at the start of
this chapter, providing an estimator structure that handles dynamic, nonlinear feature
observable / aspect angle relationships in a straightforward Bayesian fashion. The development
in this section will show that, in general, calculation of the a posteriori probability
$p(\omega_i \mid Z^f_k, Z^d_{k+s})$ requires an exhaustive search, but that use of the Larson and Peschon
approach allows one feasibly to define a “best estimate” for that quantity. Recalling the
discussion in Sect. 2.6.2, we will observe in Sect. 3.8 that this Larson and Peschon approach
to fusion of kinematic and nonkinematic or feature observable information provides a more
optimal approach for restriction of the domain of a feature observable-matching likelihood
function than prior efforts by Le Chevalier and Mieras, as discussed in Sect. 2.5.

The reader should note that the efforts discussed here are believed to be both original
and significant. The contributions are:

(1) The first application of the full Larson and Peschon approach to dynamic object
recognition (as opposed to previous suboptimal applications of the Larson and Peschon
equations, or functionally similar approaches, previously described in Sect. 2.5, and
further compared in Sect. 3.8, to follow).

(2) The first known theoretical procedure for obtaining maximum a posteriori (MAP)
probability of class membership based on feature observable and kinematic measurements, and
a priori probability of class membership, where linear (or linearization-based) estimation
techniques cannot be used.

(3)  Suboptimal  but  practical  developments  of  contribution  (2)  using  outputs  of  the  Larson 
and  Peschon  equations  to  approximate  the  proper  a  posteriori  probability  to  any  desired 
confidence  level. 

3.6.1 Facts and Assumptions. First, we confine our attention to a set of J a
priori known object classes each represented by a hypothetical aspect angle sphere
(see Figs. 1.2 and/or 2.6) or model having appropriate feature observable distributions
associated with each aspect angle value. Now, for any given problem (i.e., any given set of
discrete measurements $\{Z^f_k, Z^d_m\}$ over some time interval), we can restrict our concern
to a given aspect angle region on each of the object spheres - that is, we assume a negligible
probability that the class has presented aspect angles outside this region over the duration
of the time interval corresponding to measurements $Z^f_k$.

The aspect angle region can be thought of as the union of a set of aspect angle
“windows”, each of which is a “sub-region” of feasible aspect angle corresponding to a
particular measurement $z^f_l$ in the sequence $Z^f_k$. Fig. 3.2 shows how these windows
relate to the region as a whole. The vertical separation between windows is for clarity only.

Figure 3.2. Aspect Angle Windows and the Nominal Aspect Angle Path - All Algorithms

Note that the time interval corresponding to kinematic measurements will overlap
completely and contain the time interval corresponding to the feature observable
measurements. For reasons to be clarified in Sect. 3.6.5, in contrast to the k + s kinematic
measurements discussed in Sect. 3.4.2, the m kinematic measurements discussed here may
in general include measurements taken after as well as before the k feature observable
measurements. Also, due to differences in angles of attack, etc., for given maneuvers from
different classes, the regions or windows may not be identical from class to class. Given
some “granularity” or discretization of continuous aspect angle into “cells” on the object
spheres (assumed identical for spheres of all object classes), each cell is considered a state
$x^a$.

Now, define the super-region $X^a$ as the (super-) set of all aspect angle cells or states
that belong to the region of consideration for at least one object class, a total of say $N_s$
cells or states in number. Any set of $k + 1$ aspect angle cells, or aspect angle state
history, corresponding for analysis purposes to discrete cell locations at feature observable
measurement times along an aspect angle path which yields the k feature observable
measurements $Z^f_k = \{z^f_1, z^f_2, z^f_3, \ldots, z^f_k\}$, will be denoted
$X^a_k = \{x^a_0, x^a_1, x^a_2, x^a_3, \ldots, x^a_k\}$ (where $x^a_0$ is an a priori or
starting state and the other k states correspond one-for-one to the feature observable
measurements $Z^f_k$). Note that, as do Larson and Peschon, we choose not to make a
measurement at time $t_0$, corresponding to state $x^a_0$ - this distinction provides
a clean boundary between a priori information and new information.

The (finite) number $N_p$ of possible such aspect angle paths through $X^a$ is given by
the standard computation for the number of permutations of $N_s$ things taken $k + 1$ at a
time, with replacement, or $(N_s)^{k+1}$ paths. We will denote the set of all such paths as
$\mathbf{X}^a_k$. Henceforth, this development will use notation somewhat different from that
of Larson and Peschon by referring to a particular “nth” path of $k + 1$ aspect angle states
as $X^a_{k,n}$ (consistent with the notation $\omega_i$ referring to an ith object class).
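
To give a feel for how quickly the number of candidate paths grows, the following minimal
sketch evaluates the path count for a few cases. The cell counts and sequence lengths are
illustrative assumptions only (they are not values used in this research), and `count_paths`
is a hypothetical helper name.

```python
def count_paths(num_cells: int, num_measurements: int) -> int:
    """Number of state histories of length k + 1 drawn, with replacement,
    from num_cells aspect angle cells (k = num_measurements)."""
    return num_cells ** (num_measurements + 1)


# Illustrative (assumed) problem sizes: (number of cells, number of measurements k).
for n_cells, k in [(10, 4), (100, 4), (100, 10)]:
    print(f"cells={n_cells:4d}  k={k:2d}  paths={count_paths(n_cells, k):.3e}")
```

Even modest discretizations yield path counts far beyond what can be enumerated, which
motivates the dynamic programming treatment that follows.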

Clearly, from the definition of $X^a$, some of these paths $X^a_{k,n}$ are of negligible
probability for one or more object classes, because these paths fall outside the subsets of
$X^a$ appropriate for those classes. Other paths are of negligible probability for all classes
because they correspond to kinematically unlikely or “impossible” aspect angle paths. Since
we are dealing with a finite number of aspect angle cells, we will assign probabilities of
starting states and state transitions, rather than the probability densities (arising from
discretization of continuous space) as discussed by Larson and Peschon - recognizing that,
in general, in the limit as discretization becomes finer and the number of states approaches
infinity, the distinction is irrelevant.

We will see that applying the Larson and Peschon equations (with reasonable
modifications for object recognition) for any one model $\omega_i$ gives the particular state
history $X^a_{k,n}$, call it $X^a_{k,LP_i}$ (i.e., the “Larson and Peschon” estimate of the
aspect angle path for model $\omega_i$), which maximizes the conditional probability
$p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$. With appropriate modifications, we will be able to
find the joint conditioned probability $p(X^a_{k,n}, \omega_i \mid Z^f_k, Z^d_m)$, which we
will sum over all possible $X^a_{k,n}$ to find the quantity that we desire,
$p(\omega_i \mid Z^f_k, Z^d_m)$. Clearly, it is no longer sufficient to ask what aspect angle
path $X^a_{k,n}$ we are on - also we must ask over which object model $\omega_i$ this path is
traced. The major point of this section is to understand the relationship between (1) the
maximum likelihood path found by the Larson and Peschon approach, i.e., $X^a_{k,LP_i}$ for a
particular $\omega_i$, and a joint conditioned probability associated with that path, and
(2) the information that we want, $p(\omega_i \mid Z^f_k, Z^d_m)$.

Further  assumptions  are: 

(1) Following Larson and Peschon, assume that feature observable measurement $z^f_l$ is
independent of aspect angle state $x^a_j$ and $z^f_j$ for $t_j \neq t_l$. As discussed in
Sect. 2.4.4, this assumption is open to question for real world objects and sensors, and is
readily relaxed, at the risk of added dimensionality (i.e., computation).

For example, suppose we wish to determine the likelihood of a feature measurement
$z^f_l$ originating not simply from one aspect angle state $x^a_l$, but originating from a set
of states $x^a_{l-1}$ to $x^a_l$, passed over between discrete measurement times $t_{l-1}$ and
$t_l$. This can occur in high range resolution (HRR) radar when returns from individual pulses
are summed over a short time period to create a single sweep or measurement, if the aspect
angle is changing over that time. As discussed in Sect. 2.2.3, it is common practice to create
one HRR sweep for measurement by summing several dozen pulses over a period of much less than
one second. Determination of a measurement likelihood is feasible in this case also, but now
the likelihood “$p(z^f_l \mid [x^a_{l-1,n} \rightarrow x^a_{l,n}], \omega_i)$” is path
dependent, and many more cases will have to be considered. For the demonstrations pursued in
Chapter V, this issue will not be addressed - all feature observable measurements are assumed
to have been generated at a single aspect angle location.

Finally, with respect to the issue of measurement independence, we note in passing
that each $z^f_l$ will ideally include measurements from stochastically independent sensors
and feature spaces. This condition will provide further reduction in ambiguity between object
classes.

(2) It should be clear, and kept in mind during this development, that for any $l$,
$p(Z^f_l \mid Z^f_{l-1}) = p(z^f_l \mid Z^f_{l-1})$, and analogously that
$p(X^a_{l,n} \mid X^a_{l-1,n}) = p(x^a_{l,n} \mid X^a_{l-1,n})$, as in Larson and
Peschon’s development.

(3) We assume that the aspect angle state (cell) transition likelihood
$p(x^a_{l+1,n} \mid x^a_{l,n}, Z^d_m, \omega_i)$ and probability of starting cell location
$p(x^a_{0,n} \mid Z^d_m, \omega_i)$ are given by some a priori knowledge about the likelihood
of transitions in the aspect angle space. We will see later in this chapter that there are a
wide variety of possible sources for this “a priori” knowledge in object recognition
scenarios - we will consider various combinations of kinematic and kinematic/aspect tracking
filters and smoothers, as discussed in Sect. 2.3. For now we will simply note that this
information will be provided primarily by the observed object translational kinematics
$Z^d_m$, for a given assumption of object class.

It is extremely important to note that the transition likelihood
$p(x^a_{l+1,n} \mid x^a_{l,n}, Z^d_m, \omega_i)$, fed from external sources in this way,
provides a natural path for optimal sensor fusion. The recognition and use of that fact is a
significant contribution of this research, and also distinguishes this concept from previous
efforts that can be derived as suboptimal implementations of the equations developed here.
This distinction will be addressed in sections to follow (principally Sect. 3.8.1).

(4) We assume that $p(\omega_i)$ (a priori) is known for each object class $\omega_i$, and
furthermore that $p(\omega_i) = p(\omega_i \mid Z^d_m)$, that is, that the kinematic
measurements and derived kinematic state history provide no information as to the nature of
the object. This last assumption is clearly neither true nor desirable, particularly for
aircraft recognition when characteristic trajectories for various aircraft types are
classified probabilistically. However, in this development, we wish to assess recognition
improvement due to DP sequence comparison methods only, so all objects are considered equally
likely to have executed the observed kinematic transitions.

In the event that a particular set of kinematic measurements $Z^d_m$ can be associated
with a particular event (maneuver), and likelihoods of this event $p(Z^d_m \mid \omega_i)$ are
known for each object class, another straightforward application of Bayes’ Rule will allow one
to estimate $p(\omega_i \mid Z^d_m)$ explicitly as shown in Eqn. (3.3) below. This quantity
then replaces $p(\omega_i)$ in the subsequent development.

\[
p(\omega_i \mid Z^d_m) =
\frac{p(Z^d_m \mid \omega_i)\, p(\omega_i)}
     {\sum_{j=1}^{J} p(Z^d_m \mid \omega_j)\, p(\omega_j)}
\tag{3.3}
\]


(5) For high range resolution radar feature observables, we assume that uncertainties in
range bin alignment and scale factor are handled by finding
$\max\{p(z^f_l \mid x^a, \omega_i)\}$ for any combination of feature observable measurement
$z^f_l$ and trial aspect angle state $x^a$ on any model $\omega_i$, essentially following the
“maximum likelihood” method discussed by Weiss and Friedlander [222], and used by Barniv
[8, 13] and others [166].


Note that this maximum likelihood approach should be used with caution for the
same reason that we hesitated to use the quantity
$\max_{x^a}\{p(z^f_l \mid x^a, \omega_i)\}$ in Sect. 3.3 - for “exotic” (i.e., nonsymmetric,
multi-modal, etc.) probability densities, maximum likelihood-based estimates may give
significantly different answers from those given by, for example, an ensemble weighted-average
probability density (i.e., weighted over all equally possible range alignments, in this case).
Similar maximum likelihood assumptions are often made for feature observables other than HRR
radar.
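
The difference between the maximum likelihood treatment of range alignment and an
ensemble-averaged treatment can be sketched as follows. This is a minimal illustration
under loudly stated assumptions, not the procedure of [222]: the range profile is a short
made-up vector, misalignment is modeled as a circular shift, noise is taken as independent
Gaussian with a common standard deviation, the likelihood is left unnormalized, and the
function name `alignment_likelihoods` is hypothetical.

```python
import numpy as np


def alignment_likelihoods(measurement, template, sigma):
    """Unnormalized Gaussian likelihood of the measured profile under each
    circular shift (assumed range alignment) of the stored template."""
    measurement = np.asarray(measurement, dtype=float)
    template = np.asarray(template, dtype=float)
    likes = np.empty(len(template))
    for shift in range(len(template)):
        resid = measurement - np.roll(template, shift)
        likes[shift] = np.exp(-0.5 * np.sum(resid ** 2) / sigma ** 2)
    return likes


# Assumed toy template for one aspect angle cell, and a measurement that is the
# same profile displaced by two range bins.
template = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 0.0])
measurement = np.roll(template, 2)

likes = alignment_likelihoods(measurement, template, sigma=1.0)
max_like = likes.max()    # "maximum likelihood" treatment of alignment
avg_like = likes.mean()   # average over equally possible alignments
print(int(np.argmax(likes)), max_like, avg_like)
```

For a sharply peaked, single-mode profile the two treatments rank hypotheses similarly;
the caution above concerns cases where several alignments are nearly equally likely.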


3.6.2 Relating Path Probability $p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$ to Object Class
Probability $p(\omega_i \mid Z^f_k, Z^d_m)$. Using our current variable notation, recall that
Larson and Peschon sought the state history or path $X^a_{k,n}$ to maximize
$p(X^a_{k,n} \mid Z^f_k)$ in some general state space. Analogously in our case, trying to find
a “best” path in aspect angle space over some model $\omega_i$, we might seek a state history
to maximize $p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$. Following Larson and Peschon’s
approach, we break this expression out as:


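\[
p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i) =
\frac{p(z^f_k \mid X^a_{k,n}, Z^f_{k-1}, Z^d_m, \omega_i)\;
      p(X^a_{k,n} \mid Z^f_{k-1}, Z^d_m, \omega_i)}
     {p(z^f_k \mid Z^f_{k-1}, Z^d_m, \omega_i)}
\tag{3.4}
\]

or, equivalently, by the assumptions above:

\[
p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i) =
\frac{p(z^f_k \mid x^a_{k,n}, \omega_i)\;
      p(x^a_{k,n} \mid x^a_{k-1,n}, Z^d_m, \omega_i)\;
      p(X^a_{k-1,n} \mid Z^f_{k-1}, Z^d_m, \omega_i)}
     {p(z^f_k \mid Z^f_{k-1}, Z^d_m, \omega_i)}
\tag{3.5}
\]

where the denominator term is given in the usual fashion by summing the numerator
expression over all $X^a_{k,n}$:

\[
p(z^f_k \mid Z^f_{k-1}, Z^d_m, \omega_i) =
\sum_{n=1}^{N_p}
p(z^f_k \mid x^a_{k,n}, \omega_i)\;
p(x^a_{k,n} \mid x^a_{k-1,n}, Z^d_m, \omega_i)\;
p(X^a_{k-1,n} \mid Z^f_{k-1}, Z^d_m, \omega_i)
\tag{3.6}
\]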

Each element in the original Larson and Peschon equations has a counterpart in
this object recognition application. Thus for any given object model $\omega_i$, we can
conceptually use the Larson and Peschon approach to find the path $X^a_{k,n}$ which maximizes
$p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$. Note that, whereas the Larson and Peschon approach
started with an a priori probability density $p(x_0)$, the procedure described here would
start with an a priori probability $p(x^a_{0,n} \mid Z^d_m, \omega_i)$.

This process is illustrated in Figs. 3.3 and 3.4, using an abbreviated notation for
clarity. Note that in these figures, allowable starting states are restricted to an initial
window in the aspect angle region, and allowable transitions are restricted to fall from one
window to the next. This reflects an a priori judgment that other possibilities are of
negligible probability. Fig. 3.3 shows how transitions from window to window for one path
(arbitrarily selected from the two sample paths shown) might appear as projected onto the
entire allowable aspect angle region. Note that the joint probability density
“$p(Z^f_k, X^a_k)$” found in Fig. 3.4 is simply a shorthand notation (i.e., not showing all
appropriate conditioning variables) for the product of the numerator terms in Eqns. (3.4) or
(3.5), when $k = 4$.

Figure 3.3. Typical Matching Paths - Larson and Peschon Algorithm

So far, the discussion has been concerned with defining Larson and Peschon-like
conditioned probabilities for the aspect angle space corresponding to one object class
$\omega_i$. Now, we will modify the Larson and Peschon equations to allow us to consider the a
priori probability of class membership $p(\omega_i)$, and ultimately to define the a
posteriori probabilities $p(\omega_i \mid Z^f_k, Z^d_m)$ that we seek. Assuming (with
reservations as discussed above) that $p(\omega_i) = p(\omega_i \mid Z^d_m)$, we start with
the a priori probability $p(\omega_i)$ and then multiply by
$p(x^a_{0,n} \mid Z^d_m, \omega_i)$, and continue as in the Larson and Peschon development, to
obtain in an analogous fashion:


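\[
p(X^a_{k,n}, \omega_i \mid Z^f_k, Z^d_m) =
\frac{p(z^f_k \mid x^a_{k,n}, \omega_i)\;
      p(x^a_{k,n} \mid x^a_{k-1,n}, Z^d_m, \omega_i)\;
      p(X^a_{k-1,n}, \omega_i \mid Z^f_{k-1}, Z^d_m)}
     {p(z^f_k \mid Z^f_{k-1}, Z^d_m)}
\tag{3.7}
\]

where the denominator term is:

\[
p(z^f_k \mid Z^f_{k-1}, Z^d_m) =
\sum_{j=1}^{J} \sum_{n=1}^{N_p}
\left[ p(z^f_k \mid x^a_{k,n}, \omega_j)\;
       p(x^a_{k,n} \mid x^a_{k-1,n}, Z^d_m, \omega_j)\;
       p(X^a_{k-1,n}, \omega_j \mid Z^f_{k-1}, Z^d_m) \right]
\tag{3.8}
\]

Now sum Eqn. (3.7) over all possible $X^a_{k,n}$ for any given $\omega_i$ to obtain:

\[
p(\omega_i \mid Z^f_k, Z^d_m) =
\sum_{n=1}^{N_p} p(X^a_{k,n}, \omega_i \mid Z^f_k, Z^d_m)
\tag{3.9}
\]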

Thus, the desired $p(\omega_i \mid Z^f_k, Z^d_m)$ can be found rigorously only by keeping
track of, and performing appropriate calculations for, all possible aspect angle paths over
all possible object models - that is, all $X^a_{k,n}$ in $\mathbf{X}^a_k$ over all
$\omega_i$. In the following section, we will repeat these calculations using an approach
closer to the actual algorithm used by Larson and Peschon, and then we will consider
modifications that might be desired to provide more rigorous answers, if the basic
assumptions of this development do not apply.


3.6.3 Relating $I^*(x^a_k, k)$ to Object Class Probability
$p(\omega_i \mid Z^f_k, Z^d_m)$. Recall that the “practical version” of the Larson and Peschon
method as implemented on different object models would find the $X^a_{k,n}$, say
$X^a_{k,LP_i}$, for each $\omega_i$ that maximizes
$p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$ not by maximizing this conditional probability
directly, but rather by maximizing $I^*(x^a_k, k)$, where (using the original Larson and
Peschon form, but recalling that in our development for object recognition, all probabilities
would be conditioned also on $Z^d_m$ and $\omega_i$):

\[
\max_{x^a_k} I^*(x^a_k, k) =
\max_{x^a_k} \left[ p(z^f_k \mid x^a_k)\; p(x^a_k \mid x^a_{k-1})\;
                    I^*(x^a_{k-1}, k-1) \right]
\tag{3.10}
\]

or, in our form:

\[
\max_{x^a_{k,n}} I^*(x^a_{k,n}, k \mid \omega_i) =
\max_{x^a_{k,n}} \left[ p(z^f_k \mid x^a_{k,n}, \omega_i)\;
                        p(x^a_{k,n} \mid x^a_{k-1,n}, Z^d_m, \omega_i)\;
                        I^*(x^a_{k-1,n}, k-1 \mid \omega_i) \right]
\tag{3.11}
\]

Maximizing this quantity rather than the conditional probability is desirable because
we avoid having to compute values for all $X^a_{k,n} \in \mathbf{X}^a_k$, which we would have
to do to find the denominator term as in Eqn. (3.6). Examining
$I^*(x^a_{k,n}, k \mid \omega_i)$ closely, note that the preceding equation is equivalent to:

\[
\max_{x^a_{k,n}} I^*(x^a_{k,n}, k \mid \omega_i) =
\max_{X^a_{k,n}} \left[ p(X^a_{k,n}, Z^f_k \mid Z^d_m, \omega_i) \right]
\tag{3.12}
\]
Thus, the practical version of the Larson and Peschon equations gives the maximum
value of $p(X^a_{k,n}, Z^f_k \mid Z^d_m, \omega_i)$, and the state history estimate
$X^a_{k,LP_i}$ for a given $\omega_i$ which gives that maximum joint conditional probability
density, or, equivalently, since there is only one $Z^f_k$, that state history which maximizes
$p(X^a_{k,n} \mid Z^f_k, Z^d_m, \omega_i)$.
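
The recursion of Eqn. (3.11), with back-pointers retained so that the maximizing state
history can be recovered, can be sketched for one object model as below. This is a minimal
sketch under stated assumptions, not the implementation used in this research: the cells
are indexed 0..N-1, the starting probabilities, transition probabilities and measurement
likelihoods are assumed to be supplied as arrays, and the names `larson_peschon_path`,
`start_prob`, `trans_prob` and `meas_like` are hypothetical.

```python
import numpy as np


def larson_peschon_path(start_prob, trans_prob, meas_like):
    """Forward dynamic programming for one object model.

    start_prob : (N,)      assumed p(x_0 | Z^d_m, w_i) for each cell
    trans_prob : (k, N, N) trans_prob[l, a, b] = assumed p(x_{l+1}=b | x_l=a, Z^d_m, w_i)
    meas_like  : (k, N)    meas_like[l, b] = assumed p(z_{l+1} | x_{l+1}=b, w_i)

    Returns the maximizing state history (k + 1 cell indices) and the maximum
    joint value p(X, Z^f_k | Z^d_m, w_i), i.e. the right side of Eqn. (3.12).
    """
    k, n_cells = meas_like.shape
    score = np.asarray(start_prob, dtype=float)   # I*(x_0, 0): no measurement at t_0
    back = np.zeros((k, n_cells), dtype=int)      # best predecessor of each cell
    for l in range(k):
        cand = score[:, None] * trans_prob[l]     # indexed [previous cell, current cell]
        back[l] = np.argmax(cand, axis=0)
        score = meas_like[l] * np.max(cand, axis=0)
    path = [int(np.argmax(score))]
    for l in range(k - 1, -1, -1):                # trace the back-pointers
        path.append(int(back[l, path[-1]]))
    path.reverse()
    return path, float(np.max(score))


# Assumed two-cell, two-measurement toy problem.
start = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.4, 0.6]])
trans = np.stack([T, T])
like = np.array([[0.9, 0.2], [0.1, 0.8]])
best_path, best_joint = larson_peschon_path(start, trans, like)
print(best_path, best_joint)
```

The work per stage grows with the square of the number of cells rather than with the number
of paths, which is the practical advantage noted above.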

Suppose on the other hand that we had chosen to find
$p(X^a_{k,n}, Z^f_k, \omega_i \mid Z^d_m)$ for all $X^a_{k,n} \in \mathbf{X}^a_k$. This can be
had by computing $p(X^a_{k,n}, Z^f_k \mid Z^d_m, \omega_i)$ for each $X^a_{k,n}$ over each
$\omega_i$ and multiplying each such quantity by the appropriate $p(\omega_i)$ (again, as
above, taken equal to $p(\omega_i \mid Z^d_m)$, with the same qualifications) to find:


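\[
p(X^a_{k,n}, Z^f_k \mid Z^d_m, \omega_i)\; p(\omega_i) =
p(X^a_{k,n}, Z^f_k, \omega_i \mid Z^d_m)
\tag{3.13}
\]

But:

\[
p(\omega_i \mid Z^f_k, Z^d_m) =
\sum_{n=1}^{N_p} p(X^a_{k,n}, \omega_i \mid Z^f_k, Z^d_m)
\tag{3.14}
\]

or:

\[
p(\omega_i \mid Z^f_k, Z^d_m) =
\sum_{n=1}^{N_p}
\frac{p(X^a_{k,n}, Z^f_k, \omega_i \mid Z^d_m)}{p(Z^f_k \mid Z^d_m)}
\tag{3.15}
\]

where:

\[
p(Z^f_k \mid Z^d_m) =
\sum_{j=1}^{J} \sum_{n=1}^{N_p} p(X^a_{k,n}, Z^f_k, \omega_j \mid Z^d_m)
\tag{3.16}
\]

or:

\[
p(\omega_i \mid Z^f_k, Z^d_m) =
\frac{\sum_{n=1}^{N_p} p(X^a_{k,n}, Z^f_k, \omega_i \mid Z^d_m)}
     {\sum_{j=1}^{J} \sum_{n=1}^{N_p} p(X^a_{k,n}, Z^f_k, \omega_j \mid Z^d_m)}
\tag{3.17}
\]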


which is equivalent to the expression in Eqn. (3.9), requiring consideration of probabilities
for all paths $X^a_{k,n}$, over all object models $\omega_i$ - an exhaustive search.
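
For a problem small enough to enumerate, the exhaustive search of Eqn. (3.17) can be
written out directly. The sketch below is illustrative only: the per-class inputs (starting
probabilities, transition probabilities, measurement likelihoods) and the two-class toy
numbers are assumptions, and `exhaustive_class_posterior` is a hypothetical name.

```python
import itertools
import numpy as np


def path_joint(path, start_prob, trans_prob, meas_like):
    """p(X, Z^f_k | Z^d_m, w) for one state history (tuple of k + 1 cell indices)."""
    p = start_prob[path[0]]
    for l in range(1, len(path)):
        p *= trans_prob[l - 1][path[l - 1], path[l]] * meas_like[l - 1][path[l]]
    return p


def exhaustive_class_posterior(models, priors):
    """Sum the joint of Eqn. (3.13) over every path for every class, then
    normalize over classes as in Eqn. (3.17).

    models : list of (start_prob, trans_prob, meas_like), one tuple per class
    priors : list of a priori class probabilities p(w_j)
    """
    totals = []
    for (start, trans, like), prior in zip(models, priors):
        k, n_cells = like.shape
        total = sum(path_joint(path, start, trans, like)
                    for path in itertools.product(range(n_cells), repeat=k + 1))
        totals.append(prior * total)
    totals = np.array(totals)
    return totals / totals.sum()


# Assumed toy problem: two classes sharing kinematics but differing in signatures.
start = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.4, 0.6]])
trans = np.stack([T, T])
models = [(start, trans, np.array([[0.9, 0.2], [0.1, 0.8]])),
          (start, trans, np.full((2, 2), 0.5))]
posterior = exhaustive_class_posterior(models, [0.5, 0.5])
print(posterior)
```

The inner loop visits every one of the paths in the set of all paths, so this form is
usable only as a reference for checking approximations on very small problems.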

In identifying the fact that $p(\omega_i \mid Z^f_k, Z^d_m)$ can be found properly only by
exhaustive search, we have reached the fundamental objective of this Bayesian development.
The assumptions that were made in pursuing this objective can be relaxed without changing this
fundamental result, by implementing the suggestions made above where those assumptions
were introduced, and repeating the development. Summarizing key issues, those relaxed
assumptions could include: (1) allowing for an infinite number of aspect angle locations (a
continuous space); (2) allowing for the generation of a single feature observable
measurement from a set of aspect angle locations, rather than one; (3) allowing for inclusion
of a priori information on maneuver likelihoods for different classes; and (4) obtaining
maximum a posteriori probabilities, rather than maximum likelihoods, for signature-to-aspect
angle associations and other events where uncertainty exists in the origin conditions for
the observed signature (due to uncertain range registration, for example).

3.6.4 Approximations for $p(\omega_i \mid Z^f_k, Z^d_m)$, and Associated Problems. In any
case, it is clear that the a posteriori probability $p(\omega_i \mid Z^f_k, Z^d_m)$ will not
be practical to obtain - attempting to keep track of conditional probabilities for all tracks
$X^a_{k,n}$ over all object models $\omega_i$ would be in general a monumental task, to be
avoided if at all possible. Modifying the right side of Eqn. (3.17) to take summations over
prescribed sets of $X^a_{k,n}$'s rather than all (e.g., proper subsets of $\mathbf{X}^a_k$)
for each $\omega_i$ in Eqn. (3.17) creates a limiting process, so that as we converge toward
summations over all $X^a_{k,n} \in \mathbf{X}^a_k$ for each $\omega_i$, the modified term
converges toward the desired probability $p(\omega_i \mid Z^f_k, Z^d_m)$.
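
The limiting process can be illustrated numerically by keeping only the largest terms of
each class's sum and watching the truncated ratio approach the exhaustive value. The sketch
below is a demonstration aid only, since it still enumerates every path in order to sort
them; the toy inputs are assumptions and the function names are hypothetical.

```python
import itertools
import numpy as np


def sorted_path_joints(start_prob, trans_prob, meas_like, prior):
    """All values p(X, Z^f_k, w | Z^d_m) for one class, largest first."""
    k, n_cells = meas_like.shape
    vals = []
    for path in itertools.product(range(n_cells), repeat=k + 1):
        p = prior * start_prob[path[0]]
        for l in range(k):
            p *= trans_prob[l][path[l], path[l + 1]] * meas_like[l][path[l + 1]]
        vals.append(p)
    return np.sort(np.array(vals))[::-1]


def truncated_posterior(models, priors, num_paths):
    """Eqn. (3.17) with each inner sum limited to the num_paths largest terms."""
    sums = np.array([sorted_path_joints(*m, prior=p)[:num_paths].sum()
                     for m, p in zip(models, priors)])
    return sums / sums.sum()


# Assumed toy problem (two cells, two measurements, hence eight paths per class).
start = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.4, 0.6]])
trans = np.stack([T, T])
models = [(start, trans, np.array([[0.9, 0.2], [0.1, 0.8]])),
          (start, trans, np.full((2, 2), 0.5))]
for m in (1, 2, 4, 8):
    print(m, truncated_posterior(models, [0.5, 0.5], m))
```

With all eight paths retained the result equals the exhaustive posterior; with very few
paths the estimate can differ noticeably, which bears on how much weight a single-path
approximation should be given.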

We recognize that most of the paths will contribute little to the final probability
calculations. Clearly, the path which contributes the most to the probability calculation
for each $\omega_i$ is the one given by the Larson and Peschon equations, $X^a_{k,LP_i}$.
Therefore, if we make the (extreme) choice of approximating the desired
$p(\omega_i \mid Z^f_k, Z^d_m)$ using only one path $X^a_{k,n}$ for each $\omega_i$, the most
reasonable such approximation would be given by (note the “hat” over p, denoting an
estimate):

\[
\hat{p}(\omega_i \mid Z^f_k, Z^d_m) =
\frac{p(X^a_{k,LP_i}, Z^f_k, \omega_i \mid Z^d_m)}
     {\sum_{j=1}^{J} p(X^a_{k,LP_j}, Z^f_k, \omega_j \mid Z^d_m)}
\tag{3.18}
\]

where, using the appropriate $I^*(x^a_{k,n}, k \mid \omega_i)$ for each $\omega_i$:

\[
p(X^a_{k,LP_i}, Z^f_k, \omega_i \mid Z^d_m) =
\left[ \max_{x^a_{k,n}} I^*(x^a_{k,n}, k \mid \omega_i) \right] p(\omega_i)
\tag{3.19}
\]

We assume that the aspect angle spaces for the various object models are adequately
discretized, or sampled in a Shannon sampling sense, so that variations in feature
observables from one discrete aspect angle state to the next are “small” in some sense - for
example, an expected feature observable value from a (continuous) aspect angle value
between two adjacent discrete aspect angle values should be well approximated as a linear
interpolation of the feature observable values for the two adjacent discrete states.
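
The interpolation property assumed here can be stated concretely as below. This is a sketch
under simplifying assumptions: aspect angle is treated as a single angle in degrees rather
than a location on a sphere, the stored feature vectors are made-up numbers, and
`expected_feature` is a hypothetical helper built on `numpy.interp`.

```python
import numpy as np


def expected_feature(aspect_deg, cell_centers_deg, cell_features):
    """Expected feature vector at a continuous aspect angle, taken as a linear
    interpolation between the stored vectors of the two adjacent cells.

    cell_centers_deg : increasing 1-D array of cell-center angles (assumed 1-D aspect)
    cell_features    : (num_cells, num_components) stored expected feature values
    """
    cell_features = np.asarray(cell_features, dtype=float)
    return np.array([np.interp(aspect_deg, cell_centers_deg, cell_features[:, j])
                     for j in range(cell_features.shape[1])])


# Assumed toy model: three cells, two feature components per cell.
centers = np.array([0.0, 5.0, 10.0])
features = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 8.0]])
print(expected_feature(2.5, centers, features))
print(expected_feature(7.5, centers, features))
```

If measured signatures at intermediate angles depart strongly from such interpolated
values, the discretization is too coarse for the assumption to hold.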

Then, progressively better approximations to $p(\omega_i \mid Z^f_k, Z^d_m)$ would be
given by summing also the contributions from paths for each $\omega_i$ that pass through
points $x^a_{l,n}$ in progressively larger neighborhoods around the points in $X^a_{k,LP_i}$
for each $\omega_i$. An alternative and perhaps better approach would be to consider some
number of additional paths $X^a_{k,n}$ (for fair comparison, the same number of additional
paths for each class $\omega_i$) tracing back from terminal points $x^a_{k,n}$ that resulted
in values of $I^*(x^a_{k,n}, k \mid \omega_i)$ closest to the maximum value which defined
$X^a_{k,LP_i}$. A third alternative would be to consider contributions from aspect angle
states that lie in between the elements of $X^a_{k,LP_i}$, but which are skipped along that
path, or not included in $X^a_{k,LP_i}$ per se. Consideration of this third alternative will
lead back to classical dynamic programming-based sequence comparison methods in Sect. 3.7.
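
The single-path estimate of Eqns. (3.18) and (3.19), together with the second alternative
above (adding the best paths that end in the next-best terminal cells), can be sketched as
follows. This is an illustrative sketch only: the inputs are assumed arrays as in a
discretized model, the toy numbers are made up, and the names `terminal_scores` and
`approx_class_posterior` are hypothetical.

```python
import numpy as np


def terminal_scores(start_prob, trans_prob, meas_like):
    """I*(x_k, k | w) for every terminal cell, by the recursion of Eqn. (3.11)."""
    score = np.asarray(start_prob, dtype=float)
    for T_l, like_l in zip(trans_prob, meas_like):
        score = like_l * np.max(score[:, None] * T_l, axis=0)
    return score


def approx_class_posterior(models, priors, num_terminal=1):
    """Approximate class posterior from the best path into each of the
    num_terminal highest-scoring terminal cells of every class.
    num_terminal=1 corresponds to Eqns. (3.18)-(3.19)."""
    sums = []
    for (start, trans, like), prior in zip(models, priors):
        best = np.sort(terminal_scores(start, trans, like))[::-1][:num_terminal]
        sums.append(prior * best.sum())
    sums = np.array(sums)
    return sums / sums.sum()


# Assumed toy problem: two classes, two cells, two measurements.
start = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.4, 0.6]])
trans = np.stack([T, T])
models = [(start, trans, np.array([[0.9, 0.2], [0.1, 0.8]])),
          (start, trans, np.full((2, 2), 0.5))]
print(approx_class_posterior(models, [0.5, 0.5], num_terminal=1))
print(approx_class_posterior(models, [0.5, 0.5], num_terminal=2))
```

Using the same `num_terminal` for every class follows the fair-comparison remark above; no
extra recursion is needed, since the terminal scores are already available from the forward
pass.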

In  summary,  we  have  obtained  in  this  section  an  expression  for  the  probability  of 
object  class  membership  via  Bayesian  techniques,  where  the  new  information  (in  a  Bayesian 
sense)  is  given  by  feature  space  measurements,  conditioned  on  the  likelihood  of  required 
kinematic  transitions  in  the  underlying  aspect  angle  state  space.  This  conditioning  provides 
the  restriction  of  feature  space  matches  to  kinematically  likely  subsets,  as  discussed  in 
Sect.  2.6.  Due  to  the  assumed  highly  nonlinear  and  complex  nature  of  the  relationship 
between  transitions  in  aspect  angle  and  feature  space,  we  used  a  Larson  and  Peschon 
estimation  approach,  which  in  general  forces  us  to  approximate  by  using  only  a  finite 
number  (perhaps  one)  of  likely  aspect  angle  state  sets  (sequences)  for  each  object  class. 

3.6.5 Obtaining “A Priori” Information for the L&P Approach. A significant
question left unanswered in the previous section is how to obtain the a priori knowledge of
aspect angle transitions, expressed as the aspect angle state (cell) transition probability or
likelihood $p(x^a_{l+1,n} \mid x^a_{l,n}, Z^d_m, \omega_i)$ and probability of starting cell
location $p(x^a_{0,n} \mid Z^d_m, \omega_i)$. In object recognition, this is simply
information that we have about the object-sensor angular position and rate, apart from what we
may learn from the latest feature observable measurement. If our only other source of object
information is a conventional “kinematic” object tracker using prior and current position,
(pointing) angle, and range rate measurements against highly dynamic objects like aircraft
capable of sudden turns, we may rightly feel that information on angular position and rate
from this source is of little value. At least two reasonable cases for defining
$p(x^a_{l+1,n} \mid x^a_{l,n}, Z^d_m, \omega_i)$ from kinematics exist, however.

First, if we are willing to accept non-real time estimates, we can smooth our
kinematic measurements to provide a high-quality estimate of object velocity and acceleration.
If the object is an aircraft determined to be in a turn, with reasonably constant
acceleration relative to the target body frame for a period of several seconds, a smoother can
be made to yield an estimate of target state adequate to allow a reasonable determination of
aspect angle from kinematic (smoother) information alone. An L&P estimator using this concept
is discussed in Chapter V.
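
One way such smoother output might be turned into cell transition probabilities is sketched
below. Everything in it is an assumption made for illustration and is not the construction
used in Chapter V: aspect angle is reduced to one dimension, the smoother is assumed to
supply a nominal aspect angle change between feature measurements with a Gaussian
uncertainty, transitions outside a window about the predicted angle are given zero
probability, and `transition_matrix` is a hypothetical name.

```python
import numpy as np


def transition_matrix(cell_centers_deg, delta_deg, sigma_deg, window_deg):
    """Assumed cell-to-cell transition probabilities for one time step.

    cell_centers_deg : 1-D array of cell-center aspect angles (assumed 1-D aspect)
    delta_deg        : nominal aspect angle change implied by smoothed kinematics
    sigma_deg        : assumed standard deviation of that change
    window_deg       : half-width of the window of allowable destination cells
    Row a, column b holds the probability of moving from cell a to cell b.
    """
    c = np.asarray(cell_centers_deg, dtype=float)
    err = c[None, :] - (c[:, None] + delta_deg)      # destination minus prediction
    w = np.exp(-0.5 * (err / sigma_deg) ** 2)
    w[np.abs(err) > window_deg] = 0.0                # outside the window: negligible
    row_sums = w.sum(axis=1, keepdims=True)
    # Rows whose window falls entirely off the modeled region are left all zero.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)


# Assumed example: 5-degree cells, nominal 10-degree change per measurement interval.
cells = np.arange(0.0, 50.0, 5.0)
T = transition_matrix(cells, delta_deg=10.0, sigma_deg=3.0, window_deg=7.0)
print(np.round(T[0], 3))
```

A separate matrix would be built for each measurement interval and each class hypothesis,
since the nominal change depends on both the observed kinematics and the assumed class.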

Second, recall that the Kendrick-type filter provides a high-quality velocity and
acceleration estimate for aircraft targets by considering the target aerodynamic state
implied by aspect angle measurements. This implies a reasonable real-time estimate for
$p(x^a_{l+1,n} \mid x^a_{l,n}, Z^d_m, \omega_i)$ from a Kendrick-type filter. Conversely,
however, the Kendrick-type filter requires a reasonable aspect angle estimate, which may not
be available from conventional pose estimators in some feature spaces. Such an estimate is,
however, available from the L&P-type estimator, as the latest state $x^a_k$ in the sequence
$X^a_{k,LP_i}$. We will return to this thought in Sect. 3.9.

Note that each of these two cases prescribes only a minimum-sufficient set of
conditions for providing reasonable a priori aspect angle transition information on aircraft
targets to a Larson and Peschon-type algorithm. In the first case, we are able to use a
simple tracking filter and kinematic measurements only, but surrender real-time operation
by the need to smooth. In the second case, we accept a more complicated tracking filter,
but recoup the possibility of real time information. If, on the other hand, we desire to have
the best possible information quality regarding aspect angle transition, the proper answer
is perhaps some form of integrated kinematic/aspect-angle filter and a smoother. Simply,
we should always be able to estimate better with more information than we can with less.

In any case, however, we see that the time period over which kinematic measurements
$Z^d_m$ are taken must encompass that over which feature observable measurements $Z^f_k$ are
taken. In the first case, the smoothing process will require that kinematic measurements
precede and follow the time interval over which feature observables are measured. In
the second case, it will be desirable that initial kinematic measurements precede feature
observable measurements, so that target velocity information is available for use with initial
feature observable measurements. This latter case was true in Sect. 3.4.2, for which we
effectively had m = k + s with s > 1.

A final caution is in order regarding the estimation of aspect angle states or
transitions from translational kinematics. The quality of these estimates is driven by (1) the
extent to which aspect angle and kinematics are linked for each object class, and (2) the
extent to which we properly understand that linkage for each possible object class. For ground
vehicles, for example, a fundamental problem may arise from assumptions that a vehicle is
turning on a flat surface. For helicopter targets, aspect angle and translational kinematics
are very poorly linked in some flight regimes - in such cases, we may be able to say only
that aspect angle change with time is bounded, although with unpredictable direction.

With respect to aircraft, the reader familiar with developments in control-configured
flight will recognize that many of our concerns about the ability to predict aspect angle
from kinematic measurements arise from the new possibilities presented by these
developments, relative to conventional “coordinated turn” flight. An interesting discussion of
these possibilities and underlying tactical requirements is found in [112]. Some of the new
capabilities include high angle of attack flight for maneuverability, pitch pointing for gun
aiming without change of flight path angle, zero angle of attack turns that keep the wind
vector aligned with the body Xf, or longitudinal axis, and so on.

All of these possibilities complicate the problem, but do not make recognition impossible
by any means. If target behavior hypotheses in addition to the coordinated turn model
are feasible and expected for one or more target classes, the recognizer can be designed to
consider them. This increases the dimensionality of our problem, which is already naturally
expressed as a multiple model (object or target class) estimator - the fundamental
approach discussed in Sect. 2.3.1.3, whether or not the models are implemented as Kalman
filters. However, as long as at least one of our hypotheses is close in some sense to the
actual  target  behavior,  correct  recognition  remains  feasible.  In  fact,  the  computation  of 
aspect  angle  estimates  from  translational  kinematics  is  simpler  for  some  of  these  options 
(e.g.,  the  zero  angle  of  attack  turns  noted  above)  than  for  the  coordinated  turn  hypothesis 
assumed  for  aircraft  targets  here.  Thus,  implementation  of  models  for  these  hypotheses  in 
a multiple model recognizer or estimator will be more straightforward than implementation
of the model for the fundamental coordinated turn hypothesis. In nearly every case,
however,  target  kinematics  provides  some  information  about  aspect  angle  -  our  intent  is 
to  use  that  information. 

3.6.6 Issues in the Use of Larson and Peschon Methods in Object Recognition. As
implied in Sect. 2.4.5 of the last chapter, there is a potential pitfall of a “basic” Larson and
Peschon approach as applied to object recognition (e.g., Eqn. (3.18)) - making decisions
based on but one set of k aspect angle states or one path per object model. In fact, assuming
equal a priori probabilities for each $\omega_i$, the path which yields the highest $I^*(x^a_{k,n}, k \mid \omega_i)$
over all $\omega_i$ may not fall on the particular $\omega_i$ which has the highest actual a posteriori
probability $p(\omega_i \mid Z^f, Z^k)$ of being the correct class. It may be that one model has a
particularly “well-configured” set of aspect angle states and associated feature observable
values, such that the “best” path $X^{a*}_{k,n}$ on this model traverses these states and gives this
model the highest $p(X^{a*}_{k,n}, Z^f \mid Z^k, \omega_i)$, but if all possible $X^a_{k,n}$ are considered, this model
is less likely to have been the origin of the observed $Z^f$ than some other model.

An  example  is  helpful  to  illustrate  this  issue  (see  Fig.  3.5).  Suppose  we  have  two 
contiguous  sequences  of  three  aspect  angle  “pixels”  each.  Presume  that  we  know  that  the 
object  presented  one  of  the  two  sequences  to  the  sensor  at  some  imprecisely  known  rate 
over some interval of time, including the time t_m at which a measurement z was received.
The probability density or likelihood p(z | x) that state x would yield the (scalar)
measurement  z  is  taken  to  be  Gaussian,  with  mean  and  variance  shown  for  each  pixel.  Now 
suppose  the  measured  value  is  “z  =  3.05”.  If  required  to  choose  between  the  two  paths,  a 
basic  L&P-type  algorithm  would  choose  path  one,  since  this  path  contains  one  pixel,  P12, 
most  likely  to  have  produced  the  given  observation. 

Path One (pixels P11, P12, P13):

p(z | x = P11): Gaussian with mean 6.0, standard deviation 1.0
p(z | x = P12): Gaussian with mean 3.0, standard deviation 1.0
p(z | x = P13): Gaussian with mean 6.0, standard deviation 1.0

Path Two (pixels P21, P22, P23):

p(z | x = P21): Gaussian with mean 2.95, standard deviation 1.0
p(z | x = P22): Gaussian with mean 2.95, standard deviation 1.0
p(z | x = P23): Gaussian with mean 2.95, standard deviation 1.0

Figure 3.5. Two Aspect Angle Paths

But,  given  our  imprecise  information  on  the  transition  of  the  sensor  over  these  pixels, 
it  is  perhaps  almost  equally  likely  that  the  first  or  third  cells  could  have  generated  the 
received  measurement.  Which  path  is  truly  more  likely  to  have  generated  the  observation? 
Using  the  full  Bayesian  approach  as  in  Eqn.  (3.9)  or  (3.17),  we  would  select  path  two. 
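
The comparison in this example can be checked numerically. The following is a minimal sketch, not the implementation used in this research: it takes the Gaussian parameters of Fig. 3.5, scores each path once by its single best pixel (in the spirit of the basic L&P-type choice) and once by averaging the likelihood over the three pixels. The uniform weighting over pixels is an assumption made here to stand in for the imprecisely known transition rate, and the function names are hypothetical.

```python
import math


def gaussian_pdf(z, mean, std):
    """Likelihood p(z | x) for a scalar Gaussian measurement model."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


# (mean, standard deviation) for each pixel, as listed in Fig. 3.5.
PATHS = {
    "path one": [(6.0, 1.0), (3.0, 1.0), (6.0, 1.0)],
    "path two": [(2.95, 1.0), (2.95, 1.0), (2.95, 1.0)],
}


def best_pixel_score(z, pixels):
    """Score a path by the single pixel most likely to have produced z."""
    return max(gaussian_pdf(z, m, s) for m, s in pixels)


def marginal_score(z, pixels):
    """Score a path by averaging over its pixels.

    Assumption: each pixel is equally likely to be the origin of z.
    """
    return sum(gaussian_pdf(z, m, s) for m, s in pixels) / len(pixels)


def choose(z, score):
    """Return the name of the path with the highest score for measurement z."""
    return max(PATHS, key=lambda name: score(z, PATHS[name]))


if __name__ == "__main__":
    z = 3.05
    print("best-pixel choice:", choose(z, best_pixel_score))
    print("marginal choice:  ", choose(z, marginal_score))
```

Under these assumptions the best-pixel rule favors path one by a narrow margin, while the averaged likelihood favors path two by a wide one, which is the behavior described in the text.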

Now  observe  that  a  classical  sequence  comparison  or  “dynamic  time  warping”  concept 
with simple continuity rules, unable to delete pixels and “skip” pixels P11 and P13, would
have  picked  path  two,  since,  by  any  reasonable  metric,  this  path  is  more  likely  over  its 
extent  to  have  produced  the  observed  z. 

This example illustrates a potential weakness in L&P-based concepts for object recognition
by aspect angle path determination, and indeed, a caution in the use of the Larson
and  Peschon  method.  Recall  that  in  the  Barniv  and  Kramer  applications  of  the  Larson 
and  Peschon  method,  we  had  visibility  over  the  individual  states,  i.e.,  cells  or  sequences 
of cells overlying some physical space. The origin state of a given measurement was never
ambiguous  -  the  ambiguity  lay  in  whether  or  not  a  given  string  of  measurements  over  some 
states  represented  a  real  target.  This  is  not  so  in  object  recognition  when  we  take,  as  states, 
“cells”  of  aspect  angle  on  some  hypothetical  aspect  angle  sphere  -  here  we  never  really 
know  which  aspect  angle  state  (cell)  generated  which  measurement  -  only  that  a  particular 
measurement  was  generated  as  a  certain  set  of  states  was  passed  through  or  over. 

Thus, for the object recognition application, when the cells or aspect angle
states outnumber the measurements, we are perhaps better advised to ask the
question, “given that we have traversed a continuous path over one of several hypothetical
aspect angle spheres, which sphere is more likely to have yielded the observed sequence of
discrete observations?” The issues here are precisely those discussed in relation to Fig. 3.1:
what  is  the  relationship  between  the  feature  observables,  state  space,  and  likely  transitions 
on  that  state  space  as  measurements  occur? 

One  approach  to  working  this  problem  is  to  consider  contributions  from  multiple 
trajectories  as  discussed  in  Sect.  3.6.4.  An  alternative  approach  to  using  the  L&P  format 
would  be  classical  sequence  comparison  or  dynamic  time  warping-based  methods.  First, 
one  would  construct  sets  of  trajectories  through  the  state  space  over  the  time  frame  of 
interest, using the same information on aspect angle from kinematics used to provide
$p(x^a_{0,n} \mid Z^k, \omega_i)$ and $p(x^a_{j+1,n} \mid x^a_{j,n}, Z^k, \omega_i)$, the a priori information for L&P-type
approaches. These trajectories then imply sequences of feature observations, which can be
compared to the observed sequences using classical sequence comparison or DTW-like
techniques. Further, in a departure from usual DTW, we can allow the “best path” to move
from  one  trajectory  to  another.  This  defines  a  “two-dimensional”  form  of  classical  sequence 
comparison,  as  in  Fig.  2.9. 

Users  of  Larson  and  Peschon-like  approaches  for  object  recognition  such  as  those  to 
be  demonstrated  in  Chapter  V  must  keep  these  issues  in  mind.  Certainly,  one  may  conduct 
studies  in  a  straightforward  Monte-Carlo  manner  to  determine  the  probabilities  of  false  or 
correct  class  assignment  when  using  an  approximate  approach  like  that  in  Eqn.  (3.18), 
as  compared  to  a  more  probabilistically  correct  approach  like  that  given  in  Eqn.  (3.17). 
It should also be clear that all such expressions for $p(\omega_i \mid Z^f, Z^k)$ are in fact generalized
likelihood functions and, as such, are suitable for analysis using the techniques discussed
in Sect. 2.7, to be further discussed in Sect. 3.11.
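
A Monte-Carlo study of the kind suggested above can be sketched on the two-path example of Fig. 3.5. This is an illustrative sketch only, under assumptions that are not part of the original development: each path is treated as a separate object class, the true class and the origin pixel are drawn uniformly at random, a single scalar measurement is taken, and the "best state" rule stands in for an approximate approach while the averaged likelihood stands in for a more probabilistically correct one. All names are hypothetical.

```python
import math
import random


def gaussian_pdf(z, mean, std):
    """Likelihood p(z | x) for a scalar Gaussian measurement model."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


# Each class is a list of (mean, standard deviation) pixels from Fig. 3.5.
MODELS = {
    "one": [(6.0, 1.0), (3.0, 1.0), (6.0, 1.0)],
    "two": [(2.95, 1.0), (2.95, 1.0), (2.95, 1.0)],
}


def best_state_rule(z):
    """Approximate rule: pick the class holding the single most likely pixel."""
    return max(MODELS, key=lambda c: max(gaussian_pdf(z, m, s) for m, s in MODELS[c]))


def marginal_rule(z):
    """Rule that averages over all pixels (uniform pixel prior is an assumption)."""
    return max(
        MODELS,
        key=lambda c: sum(gaussian_pdf(z, m, s) for m, s in MODELS[c]) / len(MODELS[c]),
    )


def error_rates(trials=20000, seed=0):
    """Estimate the misclassification rate of both rules on the same random draws."""
    rng = random.Random(seed)
    names = list(MODELS)
    errors = {"best_state": 0, "marginal": 0}
    for _ in range(trials):
        true_class = rng.choice(names)              # equal a priori class probabilities
        mean, std = rng.choice(MODELS[true_class])  # unknown origin pixel
        z = rng.gauss(mean, std)                    # noisy scalar measurement
        errors["best_state"] += best_state_rule(z) != true_class
        errors["marginal"] += marginal_rule(z) != true_class
    return {rule: count / trials for rule, count in errors.items()}


if __name__ == "__main__":
    print(error_rates())
```

Because both rules are scored on identical draws, the difference between the two estimated error rates reflects the decision rule alone rather than sampling differences.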

3.7 Classical Sequence Comparison in Object Recognition.

In the last several sections, we developed the Larson and Peschon equations for object
recognition. While reviewing some potential problems with Larson and Peschon-like
approaches to sequence comparison, we observed that classical sequence comparison might
serve to overcome those problems. In this section, we will construct object recognition as
a problem in classical sequence comparison, and consider some problems inherent in this
approach. Ultimately, the lesson is that the Larson and Peschon equations and classical
sequence comparison are intimately related, and should be considered as “tools” for solving
sequence comparison problems (as indeed may other algorithms, like hidden Markov
models [176], etc.) - particular tools may serve particular scenarios better than others.

All of these sequence comparison or “motion warping” tools will serve to a greater
or lesser degree to achieve our goal - restricting the matching domain of object recognition
algorithms to kinematically likely sets. Thus, the “likelihood function” outputs of
these algorithms will reflect, to a greater degree than for conventional “independent look”
algorithms, the joint likelihood of observed features (signatures) and kinematics.

3.7.1 Motivating Classical Sequence Comparison in Object Recognition. In
Sect. 3.6, we constructed object recognition as a problem in Larson and Peschon estimation.
In this section, we will view the same problem from a classical sequence comparison
perspective. Figure 3.6 illustrates a baseline case of sequence comparison for multiple
models. The figure shows three object models, or hypothetical aspect angle spheres, for
which discrete signatures are recorded a priori. A sequence of true signatures or feature
observables is extracted from the topmost or true object, which is of course not identified a
priori to the recognizer. Also, derived generally in the manner by which the Kendrick estimator
in Sect. 2.3.3 found the kinematically-implied aspect angle “pseudo-measurement,”
we have a kinematically-implied aspect path for each object model.

Fig. 3.6.a

Restating the problem using the variable conventions of Sect. 3.6, we have a sequence
of measured feature observable values $z^f$ extracted over time to form the measurement
history $Z^f$. Each element in the history corresponds to an element in the sequence, i.e.,
an observed value, function (e.g., a radar range sweep measurement) or again a vector
of values at a particular measurement time. The kinematically-implied paths or aspect
angle state sequences are the same ones that would be found by propagating maximum
likelihood values of the Larson and Peschon a priori information $p(x^a_{j+1,n} \mid x^a_{j,n}, Z^k, \omega_i)$
and probability of starting cell location $p(x^a_{0,n} \mid Z^k, \omega_i)$ for each object model throughout
the entire time interval of interest.

Along that “kinematic path”, from each model j of J object models, we can extract
a sequence (vector) $\hat{Z}^f = \mathcal{H}[\hat{x}^a(t)]$ of estimated or predicted feature observable values
$\hat{z}^f_j = h[\hat{x}^a(t_j)]$ for times $t_j$ corresponding to the times at which the elements of $Z^f$ were
observed. The labeling convention used here is as in the discussion on tracking filters
previously, but the calligraphic letter “H”, or $\mathcal{H}$, distinguishes the sequence of predicted
measurements from the measurement matrix H discussed in Sect. 2.3 in connection with
the Kalman filter.

Note that this procedure will make no attempt to use individual measured feature
observable values to determine “pose estimates” for a maximum likelihood “feature
observable-based  aspect  path”  on  each  model,  as  in  Fig.  2.6.  Thus,  the  models  show  no 
such  path. 

If (1) the model class $\omega_i$ corresponds exactly to the object class, (2) the object is
moving with the kinematics assumed for the model, (3) the object signature is non-random,
and (4) our atmospheric transmission and sensors are noise-free, then $Z^f$ and $\mathcal{H}[\hat{x}^a(t)]$
will be equal. This situation is illustrated in Fig. 3.7, in which $\hat{x}^a(t_k)$ represents the kth
point in the sequence of kinematically-estimated aspect angles, while $x^a(t_k)$ represents the
kth point in the sequence of true aspect angles. Note for future consideration that here
the kinematically-estimated and true paths trace identical aspect angles at identical times.

In general, however, these conditions will not be satisfied: (1) there will be only
one observed sequence $Z^f$, but there will be one anticipated sequence $\mathcal{H}[\hat{x}^a(t)]$ for each
possible object (model) class, as we saw in Fig. 3.6 (the origin object class of $Z^f$ is, of
course, not known a priori); (2) the object kinematics will not be perfectly known; (3) the
object signature generation is a random process, even for constant aspect angle, and (4)
atmospheric transmission and sensors add further random noise to the signature.

$\hat{x}^a(t_i) = x^a(t_i)$ for all i, i = 1, 5

Figure 3.7. Complete Correspondence Between Kinematically Estimated and True Aspect
Angle Paths

Note  in  Fig.  3.6  that  the  kinematic  aspect  path  and  the  true  path  can  properly  be 
shown  together  only  on  one  model  -  the  true  (unknown  a  priori)  model.  This  is  because 
the  true  path  is  generated  by  only  one  model  (class),  not  all.  If  the  true  and  kinematic 
aspect  paths  for  this  true  model  lie  on  the  same  aspect  angle  path,  but  traverse  discrete 
signature  origin  points  along  that  path  at  different  times,  then,  in  the  absence  of  other 
random  factors,  the  measured  feature  observable  values  and  the  kinematically-estimated 
feature  observable  values  for  the  correct  object  model  will  be  equal  except  for  sections 
of  relative  compression  and  expansion.  These  differences  between  the  resulting  sequences 
can be resolved with the usual “one-dimensional” classical dynamic programming sequence
comparison,  or  “warping”  process  discussed  in  Sect.  2.4.2. 

The warping process takes place between the elements of $Z^f$ and each $\mathcal{H}[\hat{x}^a(t)]$ -
the pairing of $Z^f$ and some $\mathcal{H}[\hat{x}^a(t)]$ for which the dynamic programming-based sequence
comparison cost is least is taken to indicate the correct class $\omega_i$ for the object which yielded
$Z^f$. We may note also that, in general, the two sequences to be compared need not have an
equal number of elements. Dynamic programming-based sequence comparison techniques
handle small mismatches in the total number of elements naturally by “expanding” the
shorter sequence as required to achieve a best comparison.
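
The least-cost pairing described above can be sketched with a textbook dynamic time warping recursion. This is a minimal illustration under stated assumptions, not the algorithm evaluated in this research: feature observables are scalars, the local cost is an absolute difference, the local continuity rules are the three standard DTW predecessors, and the class names and sequence values are hypothetical.

```python
def dtw_cost(observed, predicted):
    """Classical one-dimensional DTW cost between two scalar sequences.

    Allowed predecessors of cell (i, j) are (i-1, j), (i, j-1) and (i-1, j-1),
    so either sequence may be locally "expanded" to match the other.
    """
    n, m = len(observed), len(predicted)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(observed[i - 1] - predicted[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(observed, predicted_by_class):
    """Return (best class, all costs); the least warping cost indicates the class."""
    costs = {name: dtw_cost(observed, seq) for name, seq in predicted_by_class.items()}
    return min(costs, key=costs.get), costs


# Hypothetical predicted feature sequences along each model's kinematic path.
PREDICTED = {
    "class A": [1.0, 2.0, 4.0, 3.0],
    "class B": [4.0, 3.0, 1.0, 2.0],
}

# Observed sequence: class A's sequence with two samples repeated (local expansion).
OBSERVED = [1.0, 1.0, 2.0, 4.0, 4.0, 3.0]

if __name__ == "__main__":
    best, costs = classify(OBSERVED, PREDICTED)
    print(best, costs)
```

Note that the observed and predicted sequences here have different lengths; the recursion absorbs the difference by pairing repeated observed samples with a single predicted sample.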

Where  the  true  path  and  kinematic  path  lie  side  by  side,  or  cross,  we  will  call  this  a 
condition of “off-nominal path” errors, where the nominal path is in fact the kinematically-estimated
path. Off-nominal path errors are expected to be due to errors in modeling or
estimation of aspect angle from kinematics, as discussed in Sect. 3.6.5. Consider Fig. 3.6.a
as representing a case of a nominal aspect angle path and the true path separated by an
off-nominal error. Off-nominal path errors cannot be accounted for by “one-dimensional”
classical sequence comparison or “warping”, but if the off-nominal path error is reasonably
small  and  the  feature  observables  are  reasonably  constant  in  directions  normal  to  the  two 
paths,  it  may  still  be  possible  in  general  to  remove  some  portion  of  the  errors  due  to 
compression  and  expansion,  and  perhaps  to  identify  the  best  object-model  match  over  the 
whole  trajectory. 

It  should  be  clear  that  most  of  the  illustrations  so  far  in  this  chapter  have  shown  the 
general case in which we have off-nominal path errors. If we attempt only a single “one-dimensional”
sequence comparison for each model, it is possible to conceive of situations
in  which  off-nominal  path  errors  would  lead  to  misclassification. 

As implied in the discussion of Sects. 2.4.3 and 2.4.5, however, extensions to classical
sequence comparison can deal directly with off-nominal path errors. Rather than
comparing our observed sequence to a single one-dimensional sequence from each object
model, we may compare our observed sequence to a number of sequences taken from a
two-dimensional aspect angle region on each object model, as shown in Fig. 2.9.

3.7.2  Implementing  Classical  Sequence  Comparison  in  Object  Recognition.  The 
implementation of one- and two-dimensional sequence comparison techniques on a multiple-window
aspect angle region is illustrated in Fig. 3.8. This figure uses the visual format
common  to  our  earlier  discussions. 

The dotted line labeled “1-D Track” in Fig. 3.8 shows a “one-dimensional” set of
aspect angle states against which classical sequence comparison is attempted. It is very
important to note the relationship between Fig. 2.8 and Fig. 3.8 - the first figure is functionally
identical to what we would see by viewing Fig. 3.8 from the right side, for any one
path. A “window” in Fig. 3.8 is directly analogous to a horizontal layer of circles between
the dotted lines in Fig. 2.8.

[Figure 3.8 (recovered labels: “Window for z_i”, “Track”)]

The continuity constraints for motion warping by one-dimensional classical sequence
comparison are again directly related to those for dynamic time warping and similar concepts.
The global path bounds (dotted lines in Fig. 2.8) are defined by the individual
window extents shown in Fig. 3.8. The 1-D local continuity constraints in Fig. 3.8 are
exactly those of Fig. 2.8, except that vertical transitions have been forbidden, for reasons
to be discussed below. Thus, relative to some “current” point for which costs are to
be computed, the predecessor points may lie to the rear on the same window (horizontal
transition), or one step to the rear and down on the previous window (diagonal transition).

Note that a horizontal transition corresponds to associating the same measurement
with more than one element in the aspect angle region, while a diagonal transition corresponds
to a transition from associations for one measurement to associations for the next
measurement. The 2-D local continuity constraints in Fig. 3.8 are a simple extension of
the  1-D  local  continuity  constraints,  allowing  for  a  predecessor  point  to  be  located  to  the 
right  or  left  of  the  current  point  -  that  is,  on  a  neighboring  track  parallel  to  the  direction 
of  expected  aspect  angle  change. 

The  only  additional  continuity  constraints  are  ones  that  provide  for  minimum  and 
maximum  numbers  of  horizontal  associations  or  transitions  on  each  window  -  that  is,  a 
given  measurement  or  observation  must  be  associated  with  a  minimum  number  of  aspect 
angle  cells,  but  cannot  be  associated  with  more  than  some  maximum  number  of  cells.  More 
complicated  continuity  rules  can  limit  multiple  consecutive  diagonal  transitions,  or  allow 
skipping  of  aspect  angle  cells  (“deletions”  from  the  object  signature  map,  in  the  sense  used 
by  [195]),  and  so  on. 

Vertical transitions are forbidden in our case since a vertical transition implies that
no aspect angle transition took place between two measurements - two successive measurements
are associated with the same aspect angle location. This will be practically
impossible for the scenarios in which this algorithm will be applied, and is therefore excluded
from consideration here. In some scenarios (e.g., stationary targets), it may be
necessary to provide for vertical transitions in the continuity constraints. Also, for cases in
which there are more aspect angle cells in the direction of likely transitions than there are
measurements to be associated, elimination of vertical transitions helps to prevent undesirably
short association paths in the warping path space (associating the measurements with
too few aspect angle locations). We will return to the subject of path length compensation
in  Chapter  V. 
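
The 1-D continuity rules above (horizontal and diagonal transitions only, with minimum and maximum numbers of horizontal associations per window) can be sketched as a segmentation-style dynamic program. This is a sketch under assumptions made here for illustration: feature observables are scalars, the local cost is an absolute difference, and the matched path is required to cover every cell of the track, which is a simplification of the window-defined global bounds. Each measurement is associated with a run of consecutive cells; extending a run is a horizontal transition, starting the next measurement's run is a diagonal transition, and since every cell belongs to exactly one measurement, vertical transitions cannot occur. The function and parameter names are hypothetical.

```python
def motion_warp_cost(measurements, track, min_run=1, max_run=3):
    """Least total cost of associating each measurement with a run of track cells.

    best[k][c] is the least cost of associating the first k measurements with
    the first c cells, where each measurement covers between min_run and
    max_run consecutive cells and no cell is shared (no vertical transitions).
    """
    n_meas, n_cells = len(measurements), len(track)
    inf = float("inf")
    best = [[inf] * (n_cells + 1) for _ in range(n_meas + 1)]
    best[0][0] = 0.0
    for k in range(1, n_meas + 1):
        for c in range(1, n_cells + 1):
            run_cost = 0.0
            for run in range(1, max_run + 1):
                if run > c:
                    break
                # Extend the run backward by one cell (a horizontal transition).
                run_cost += abs(measurements[k - 1] - track[c - run])
                candidate = best[k - 1][c - run] + run_cost
                if run >= min_run and candidate < best[k][c]:
                    best[k][c] = candidate
    return best[n_meas][n_cells]


# Hypothetical signature values along a 1-D track, and three measurements.
TRACK = [1.0, 1.0, 2.0, 2.0, 2.0, 5.0, 5.0]
MEASUREMENTS = [1.0, 2.0, 5.0]

if __name__ == "__main__":
    print(motion_warp_cost(MEASUREMENTS, TRACK))
    print(motion_warp_cost(MEASUREMENTS, TRACK, max_run=2))
    print(motion_warp_cost(MEASUREMENTS, list(reversed(TRACK))))
```

With at most two cells per measurement, three measurements cannot cover the seven-cell track, so the second call returns infinity; this is the kind of effect the minimum and maximum association limits are meant to control.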

Resilience  to  off-nominal  path  errors  was  of  course  a  strong  point  for  Larson  and 
Peschon-based  approaches  defined  in  Sect.  3.6.  These  approaches  naturally  worked  on 
two-dimensional  aspect  angle  regions,  and  allowed  one  to  quantify  the  effect  of  off-nominal 
path  errors,  in  a  Bayesian  probabilistic  structure. 

The differences between sequences expected from kinematic measurements and observed
sequences may be more complicated than the simple off-nominal errors shown above.
Fig.  3.9  illustrates  a  case  in  which  the  true  aspect  angle  path  actually  doubles  back  on 
itself  for  a  short  period.  Here,  this  “wobble”  is  not  picked  up  by  the  kinematic  sensors  and 
thus  is  not  reflected  in  the  kinematically-derived  aspect  path. 

Note  that  if  a  linear  (non-warped)  sequence  comparison  were  made  between  the 
kinematic  and  measured  sequences  (assume  negligible  sensor  noise  on  the  feature  observable 
measurements) the effective aspect angle pairings would be as shown in the first point-to-point
association below Fig. 3.9. With dynamic programming-based sequence matching
techniques,  however,  we  should  be  able  to  determine  that  a  more  reasonable  match  is  as 
shown  in  the  second  point-to-point  association  below  Fig.  3.9. 

Dealing with cases like the one shown here will require us to be careful in designing
those continuity constraints (see Sect. 2.4.2) employed to restrict the dimensionality of
dynamic programming decisions. These may prevent us from making certain optimal
point-to-point assignments. In the case in Fig. 3.9, for example, we would probably desire
to associate point $\hat{x}^a(t_3)$ with point $x^a(t_5)$, but because of an association already made
between two other points, the former, desired association will be prohibited in classical
sequence comparison - this would amount to moving backward in the dynamic programming
assignment matrix. “Backward” path options are generally forbidden in classical sequence
comparison to limit the Curse of Dimensionality.

[Figure 3.9 panel: Kinematic Aspect Angle to True Aspect Angle Association Without Warping. Figure note: match of $\hat{x}^a(t_3)$ to $x^a(t_5)$ not permitted here.]

Figure 3.9. Kinematic-to-True Aspect Angle Association Improvement With Warping

These classical sequence comparison continuity constraints, or $p(x_{k+1} \mid x_k)$ in the
Larson and Peschon formulation, or other restrictions on aspect angle transitions, can be
tuned  to  allow  the  sequence  comparison  process  to  include  such  folding  or  other  errors  in 
the  estimate  of  aspect  angle  transition  from  kinematics.  Any  loosening  of  restrictions  must 
be  done  carefully,  however,  because  we  want  the  restriction  of  matching  paths  to  reasonable 
aspect  angle  progressions  to  be  the  significant  factor  which  increases  matching  costs  for 
incorrect  object-model  associations. 

3.8  Sequence  Comparison  Concept  Relationships 

At  this  point,  we  have  established  four  concepts  for  the  use  of  dynamic  programming 
in  dynamic  object  and  target  recognition:  (1)  The  Larson  and  Peschon  approach,  (2) 
classical  sequence  comparison  techniques  (including  dynamic  time  warping),  (3)  the  Le 
Chevalier (et al.) approach, and (4) the Mieras (et al.) approach. This section will
illuminate  the  differences  and  similarities  in  these  approaches. 

Fig.  3.10  is  provided  as  an  aid  for  this  discussion.  It  is  intended  to  show  relationships 
or  the  lack  thereof  between  concepts  that  relate  to  the  author’s  original  research,  which 
is  noted  in  the  boxed  region.  Fundamentally,  the  author’s  research  blends  concepts  from 
two  fields  -  linear  (and  linearization-based)  estimation  and  dynamic  programming.  The 
author’s conceptualization of dynamic programming as a tool for moving object recognition
was preceded independently by, but includes and transcends, the developments of Le
Chevalier and Mieras - to represent them specifically as applications of dynamic programming
sequence comparison for automatic object recognition (AOR) of dynamic objects,
these  names  are  contained  in  double  boxes. 

The key point expressed in the upper portion of Fig. 3.10 is that dynamic programming
sequence comparison can be seen to be applicable to multisensor fusion for dynamic
object recognition through at least three different paths. The leftmost, or speech processing
path (the author’s original inspiration), lists researchers whose efforts were discussed
in Sect. 2.4.2. The center path is Le Chevalier (Sect. 2.5.7), whose syntactic approach is
also fundamentally language-inspired. The rightmost path is that begun by Larson and
Peschon, whose equations were applied for tracking by Barniv, as described in a recent
text on multisensor fusion [8], and which served through earlier sources [12, 13, 129] as the
inspiration for Mieras (Sect. 2.5.8).

Figure 3.10. Relating Earlier Developments to the Author’s Research (AOR: Automatic Object Recognition)

The  lower  portion  of  Fig.  3.10  refers  to  the  origins  of  the  linear  or  linearization-based 
estimation  and  estimator  evaluation  techniques  applied  in  this  research,  as  discussed  in 
Sect. 2.3.1.1 and 2.7, respectively. It should be clear also that Bayes’ Rule plays a key role
in  the  basic  Larson  and  Peschon  equations,  but  this  connection  is  not  shown  for  reasons  of 
clarity.  The  fundamental  contribution  of  this  research  is  the  fusion  of  information  from  the 
upper  and  lower  portions  of  this  figure,  using  Bayesian  methods  where  possible,  to  provide 
new  understanding  of,  and  new  approaches  for,  dynamic  object  recognition. 

3.8.1  Contrasting  Prior  Efforts  with  the  Author’s  Research.  We  now  address  the 
relationship  between  the  Larson  and  Peschon  approach  advanced  in  Sect.  3.6,  on  one  hand, 
and  the  Le  Chevalier  and  Mieras  concepts  on  the  other  hand.  Simply,  it  is  clear  that  both 
the  Le  Chevalier  and  Mieras  concepts  are  sub-optimal  implementations  of  the  Larson  and 
Peschon  equations,  in  aspect  angle  space. 

Le Chevalier’s development effectively reduces the Larson and Peschon conditional
transition likelihood $p(x_{k+1} \mid x_k)$ to “evolutionary constraints” of unspecified form, and his
measurement-to-possible origin state comparisons do not preserve the general probabilistic
meaning of the Larson and Peschon term $p(z_{k+1} \mid x_{k+1})$, although Le Chevalier’s inter-signal
Chi-square metric is related to this form of a likelihood. Similarly, Mieras has
reduced the transition likelihood to a hard “yes/no” association limit based on whether or
not an earlier measurement/aspect angle pair (or terminus of a path of pairs) lies within
the “association gate” of a later measurement/aspect angle pair (accounting evidently also
for an aspect angle bias in the apparent direction of motion [162, 163]). Like Le Chevalier,
Mieras uses a probabilistic measurement-to-state metric (as discussed in Sect. 2.2.3).
In effect, Le Chevalier and Mieras replace the aspect angle state transition
likelihood $p(x^a_{j+1,n} \mid x^a_{j,n}, Z^k, \omega_i)$ with a uniform probability density, the extent of which is
defined by aspect angle transition bounds allowable by vehicle kinematics. For Le Chevalier,
where the aspect angle transition bounds are evidently circular around each candidate
current or prior aspect angle state, the uniform probability density can be further described
as zero mean. For Mieras, the mean is biased in the apparent direction of motion. For
both Le Chevalier and Mieras, the probability of starting cell location $p(x^a_{0,n} \mid Z^k, \omega_i)$ is
treated as equal for all starting states, or effectively ignored. This “fixed bound” approach
is illustrated in Fig. 3.11, using the multiple-window format common to earlier figures in
this chapter.

Recalling our comments in Sect. 2.6.2, we can see that this fixed bound approach is a
reasonable restriction of the domain of the likelihood functions which match feature observable
measurements to object signature libraries. Thus, fixed bound restrictions promise
improved recognition performance over algorithms which do not restrict that domain. However,
these fixed bound approaches cannot make full use of the information content in the
transition likelihood $p(x^a_{j+1,n} \mid x^a_{j,n}, Z^k, \omega_i)$. As we will see in Chapter V, the practical
effect  of  this  suboptimality  is  that,  where  high  quality  transition  likelihood  information  is 
available,  fixed  bound  algorithms  have  a  greater  tendency  than  more  optimal  approaches 
to  allow  unlikely  aspect  angle  transitions.  This  means  that,  in  comparison  to  more  optimal 
approaches,  fixed  bound  approaches  will  have  a  greater  tendency  to  misclassify  a  target, 
or  to  estimate  the  aspect  angle  sequence  on  the  true  target  incorrectly. 
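
As a minimal sketch of this contrast, the following Python fragment runs the same forward dynamic programming recursion twice over a one-dimensional row of aspect angle cells: once with a fixed bound (uniform) transition density, and once with a transition density peaked at a kinematically estimated rate. The function names, the Gaussian form of the rate-based density, and the toy log-likelihood table are illustrative assumptions, not the formulation developed in this research.

```python
import math

def lp_best_path(obs_loglik, log_trans):
    """Forward dynamic programming over aspect angle cells.

    obs_loglik[k][c] : log-likelihood of measurement k given aspect cell c
    log_trans(p, c)  : log transition likelihood from cell p to cell c
    Returns (best total log-likelihood, best cell sequence).
    All starting cells are treated as equally likely.
    """
    n_cells = len(obs_loglik[0])
    score = list(obs_loglik[0])
    back = []
    for k in range(1, len(obs_loglik)):
        new_score, ptr = [], []
        for c in range(n_cells):
            best_p = max(range(n_cells), key=lambda p: score[p] + log_trans(p, c))
            new_score.append(score[best_p] + log_trans(best_p, c) + obs_loglik[k][c])
            ptr.append(best_p)
        score = new_score
        back.append(ptr)
    end = max(range(n_cells), key=lambda c: score[c])
    path = [end]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return score[end], path[::-1]

def fixed_bound(max_step):
    """Uniform density inside the gate, zero outside (the fixed bound case)."""
    log_p = -math.log(2 * max_step + 1)
    def log_trans(p, c):
        return log_p if abs(c - p) <= max_step else -math.inf
    return log_trans

def gaussian_rate(mean_step, sigma):
    """Assumed unnormalized Gaussian density centered on an estimated aspect rate."""
    def log_trans(p, c):
        return -0.5 * ((c - p - mean_step) / sigma) ** 2
    return log_trans

# Toy table: the object truly moves through cells 0, 1, 2, but noise makes
# cell 0 look slightly better than cell 1 for the second measurement.
obs = [
    [0.0, -5.0, -5.0, -5.0, -5.0],
    [-0.5, -1.0, -5.0, -5.0, -5.0],
    [-5.0, -5.0, -1.0, -5.0, -5.0],
]
print(lp_best_path(obs, fixed_bound(2))[1])         # stalls, then jumps
print(lp_best_path(obs, gaussian_rate(1, 0.5))[1])  # follows the estimated rate
```

In this toy case the fixed bound recursion accepts a path that stalls and then jumps two cells, because every transition inside the gate is equally likely, while the rate-based density penalizes that path and recovers the steady progression.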

3.8.2  Contrasting  Classical  Sequence  Comparison-Based  vs.  Larson  and  Peschon- 
Based  “Motion  Warping”  Concepts.  At  this  point,  we  can  express  the  difference  between 
classical  sequence  comparison-based  and  the  Larson  and  Peschon-based  “motion  warping” 
concepts  (including  those  of  Le  Chevalier  and  Mieras)  in  terms  of  how  two  sequences 
(observed  and  library  model-derived)  are  compared  to  each  other,  and  what  information 
is  used. 

This comparison revolves around two points. Recall from the discussion in Sect. 2.4.5
that (1) the Larson and Peschon equations are an elaborate forward dynamic
programming-based sequence comparison technique with a probabilistic metric and continuity
rules, and (2) the continuity rules in the Larson and Peschon equations are
further restricted to require deletion of as many points as required from the library set to
match the unknown set, point for point.

First,  in  classical  sequence  comparison-based  motion  warping,  we  “pre-warp”  the 
aspect  angle  space  using  a  priori  kinematic  information  in  a  preliminary  attempt  to  predict 
a  sequence  of  observations,  and  then  use  dynamic  time  warping-like  techniques  to  make 
the  final  comparison.  In  the  Larson  and  Peschon-based  concepts,  we  warp  the  observations 
to  fit  a  like  number  of  elements  in  the  aspect  angle  space.  Use  of  kinematic  information  in 
L&P  techniques  may  vary  from  explicit  inclusion  of  transition  likelihoods  at  one  extreme, 
to  simple  feasibility  bounding  at  the  other  extreme. 

More specifically, in classical sequence comparison the “pre-warping” process, i.e.,
laying out the aspect angle space along the kinematically-estimated nominal aspect angle
path, has the effect of organizing the aspect angle space according to the maximum probable
transition rate, or simply reordering the $x_k$ in time in accordance with the maximum
$p(x_{k+1} \mid x_k)$ for all $x_k$. The motion warping process then considers deviations around this
nominal. This is the key difference between classical sequence comparison and a Larson
and Peschon-type (Le Chevalier/Mieras) approach, which works on an “unwarped” aspect
angle space directly, considering $p(x_{k+1} \mid x_k)$ or some less elaborate transition constraint
explicitly in each transition decision.

Second,  compare  Figs.  3.3  and  3.4  to  Fig.  3.8  to  observe  how  classical  sequence 
comparison  in  general  forces  every  library  signature  along  some  contiguous  aspect  angle 
path  to  associate  with  the  observed  measurements,  while  L&P-based  techniques  can  allow 
the  algorithm  to  ignore  unlikely  associations.  The  potential  pitfalls  of  this  issue  were 
observed  in  Sect.  3.6.6.  The  “basic”  Larson  and  Peschon  concept  is  a  sequence  comparison 
technique  calling  for  m  observations  to  be  matched  one-for-one  with  exactly  m  points  on  a 
model.  Thus,  as  implied  above,  Larson  and  Peschon-type  approaches  (including  those  of  Le 
Chevalier  and  Mieras)  can  result  in  picking  a  “nice”  sequence  of  points  from  an  otherwise 
inappropriate model. Classical sequence comparison, or path warping concepts, however,
may be less likely to make such an inappropriate choice, due to their tendency to allow
association  of  dissimilar  numbers  of  the  elements  of  the  two  compared  sequences  [195]. 

These  differences  are  perhaps  indicative  of  a  fundamental  difference  between  the 
L&P  approach  and  classical  sequence  comparison.  Classical  sequence  comparison  is  based 
on  comparing  discrete  observations  of  an  underlying  continuous  or  piecewise  continuous 
feature  space,  at  an  observation  frequency  which  (in  a  Shannon  sampling  theory  sense) 
captures  the  spatial  variational  trends  in  the  feature  observable  values.  The  L&P  approach 
does  not  require  that  assumption,  and  indeed  seems  designed  to  make  the  best  possible 
decision  where  that  assumption  cannot  be  made. 

Finally,  comparing  classical  sequence  comparison  side-by-side  with  a  Larson  and 
Peschon-type algorithm reveals a subtle possible practical advantage for the classical approach.
Note that, in the one-dimensional classical algorithm with “basic” continuity
constraints  (Fig.  3.8),  each  aspect  angle  cell  in  each  window  has  at  most  two  possible 
predecessors,  compared  to  Larson  and  Peschon  (L&P)-type  algorithms  (including  the  Le 
Chevalier  and  Mieras  algorithms),  where  predecessors  for  each  aspect  angle  cell  in  each 
window  may  include  many  cells  in  the  previous  window  (particularly  some  which  imply 
“backward”  motion  of  the  object).  Since  dynamic  programming  computational  load  is 
driven largely by the number of states and predecessors (dimensionality), it is clear that
classical  “continuous”  sequence  comparison  with  simple  continuity  constraints  may  have  a 
potential  for  reduced  computational  load. 
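
The two-predecessor structure can be illustrated with a short sketch. It assumes scalar “signatures” and an absolute-difference distance purely for illustration; the function name and the cost metric are hypothetical stand-ins for the probabilistic metrics discussed in this chapter.

```python
def classical_warp_cost(observed, library):
    """Minimum cumulative distance between an observed sequence and a
    contiguous path through a one-dimensional library of aspect cells.

    Basic continuity constraint: between successive observations the
    library index either stays in place or advances by one cell, so
    each cell has at most two predecessors and no library cell along
    the path is skipped.
    """
    inf = float("inf")
    n = len(library)
    # Any library cell may start the path.
    cost = [abs(observed[0] - library[c]) for c in range(n)]
    for z in observed[1:]:
        cost = [
            min(cost[c], cost[c - 1] if c > 0 else inf) + abs(z - library[c])
            for c in range(n)
        ]
    return min(cost)

library = [0, 1, 2, 3, 4]
print(classical_warp_cost([1, 2, 2, 3], library))  # a contiguous path matches exactly
print(classical_warp_cost([1, 3], library))        # skipping cell 2 is not allowed
```

Each update touches two predecessors per cell, whereas a gate of half-width g in an L&P-type recursion would examine up to 2g + 1 predecessors per cell.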

3.8.3  Comparing  Use  of  Kinematic  Information  in  the  Author’s  Research  with  the 
Le Chevalier / Mieras Concepts. Neither the Le Chevalier nor the Mieras approach makes
optimum use of the rich information available from kinematic measurements regarding aspect
angle states and transitions; exploiting that information forms the core of the author’s research.

Le  Chevalier’s  writings  do  not  specify  how  the  “evolutionary  constraint”  parameters 
would  be  obtained  for  the  target  identification  case.  As  noted  in  Sect.  2.5.7,  his  insistence 
on  “real  time”  operation  and  his  apparent  unwillingness  to  model  and  propagate  the  target 
kinematic  state  imply  that  he  does  not  use  a  smoother  or  target  kinematic  state  estimator, 
and  therefore  has  little  knowledge  of  the  likely  aspect  angle  transition  of  the  observed 
target - only that the transition is bounded. In the Mieras approach, the “evolutionary
constraints” are apparently fixed, but the constraints are evidently biased in the direction
expected from kinematic tracking [162, 163].

In  motion  warping  with  the  full  Larson  and  Peschon  approach,  the  “evolutionary 
constraints”  are  functions  of  the  object  kinematic  state  and  state  covariance  estimates,  as 
derived  from  an  extended  Kalman  filter/smoother  operating  as  discussed  in  Sect.  3.6.5. 
This  procedure  will  be  described  in  full  detail  in  Chapter  V.  Thus,  kinematic  information 
is  maintained  and  fused  in  the  same  Bayesian  probabilistic  framework  as  feature  observable 
information. With “classical” sequence comparison approaches, we at least constrain the
matching process to follow aspect angle trajectories with maximum a priori likelihood
$p(x_{k+1} \mid x_k)$.

3.8.4 Advantages of Predecessor Approaches. The reader must not be left with
the impression that the Le Chevalier and Mieras approaches would be less effective in
every case than a full or “optimal” implementation of the Larson and Peschon approach
or classical sequence comparison methodology. As with every class of pattern recognition
algorithms, the decision to employ more or less complex implementations requires trade-offs
considering feature space, information and time available, computational burden, and
so on.

As  our  results  will  show,  basic  L&P  approaches  like  those  of  Le  Chevalier  and  Mieras 
can  make  a  substantial  discrimination  improvement  by  restricting  the  wild  aspect  angle 
transitions  (and  unreasonable  low-cost  matches)  attempted  on  incorrect  object  classes  by 
an  “independent-look”  recognizer  (i.e.,  a  conventional  matching  algorithm,  as  described  in 
Sect.  2.2.1)  working  in  a  noisy  signature  domain.  However,  these  “aspect  angle  bound” 
algorithms can allow apparent aspect angle transitions that are inconsistent with the observed
kinematics, such as aspect angle sequences that stop or move in the opposite direction
from that implied by the observed kinematics - even when a set of observations is
matched  to  the  correct  object  class. 

These  effects  are  often  exhibited  in  our  tests,  particularly  for  incorrect  matches,  and 
suggest  that  subsequent  processing  of  these  aspect  angle  sequences  and/or  inclusion  of 
additional (ideally independent) feature observable measurements in each zs is warranted
to  improve  classification.  The  Mieras  algorithm  is  believed  to  apply  the  former,  or  sequence 
processing  approach  [162],  which  may  be  a  significant  improvement  over  the  approach  of  Le 
Chevalier  et  al.  [136].  It  is  important  to  note  that  the  tests  conducted  by  Le  Chevalier  et 
al. were conducted in a one-dimensional (great circle) aspect angle space, where this
“wandering”  would  have  been  less  noticeable. 

The apparent advantage accrued by these “aspect angle bound” approaches is that,
theoretically, they can be made to be “real time”, since they do not incur the time delay
required to develop the kinematic measurement-derived aspect angle rate estimate, or
$p(x_{j+1,n} \mid x_{j,n}, Z^d, \omega_i)$ given by a kinematic tracker/smoother combination. As we noted
above, however, for aircraft targets, body angular rates can be on the order of hundreds
of degrees per second, and are unobservable to the kinematic trackers generally used to
find “global” aspect limits or “windows” for feature observable-matching recognition algorithms.
Thus, it seems clear that one may need to accept delays of up to a few seconds
and  some  form  of  smoothing  to  provide  any  reliable  aspect  angle  estimates  from  kinematic 
(translational)  measurements  alone.  If  the  target  is  determined  to  be  turning  during  this 
period,  that  kinematic  information  can  and  should  be  used  explicitly. 

Chapter V will contrast the performance of various conventional and dynamic programming-based
pattern recognition approaches using generalized ambiguity functions.
The following section describes how multiple model Kalman filter parameter estimators
and dynamic programming sequence comparison methods can be combined to address
shortcomings in each respective approach. The result is a new class of estimators that
can fuse kinematic and ambiguous feature observable information in real time.

3.9  A  New  Class  of  “Coupled”  Estimators  for  Object  Recognition 

In Sect. 3.4, we used Kendrick-type kinematic/aspect-angle estimators in a classical
multiple model/residual analysis structure to give an expression for the probability
of object class membership via Bayesian techniques, where the new information is given
by kinematic measurements and aspect angle pseudo-measurements, conditioned (implicitly)
on feature space measurements. Conversely, in Sect. 3.6, we used the Larson and
Peschon (L&P) approach and Bayesian methods to find an expression for the probability
of object class membership, where the new information is given by feature space measurements,
conditioned on the likelihood of aspect angle transitions, as implied by kinematic
measurements.

We observe that Kendrick-type kinematic/aspect-angle estimators work poorly where
inadequate  estimates  of  aspect  angle  are  provided,  and  conversely  that  L&P  estimators 
work  poorly  where  inadequate  a  priori  information  p(xk+1  |  xk)  is  provided.  If  these  two 
forms  of  information  processor  are  used  jointly,  however,  we  can  use  Bayesian  techniques  to 
obtain  a  practical  expression  for  the  probability  of  class  membership  conditioned  jointly  on 
feature  space  and  kinematic  measurements.  This  is  a  major  objective  of  this  dissertation 
research,  and  provides  Step  Three  in  the  three  step  process  outlined  at  the  start  of  this 
chapter. 

These  last  observations  allow  us  to  envision  a  new  class  of  “coupled”  filter  structures, 
in which conventional filters and/or smoothers perform linear or linearized estimation tasks
appropriate to them, and counterpart Larson and Peschon estimators perform nonlinear
estimation tasks to which they are suited. The key point to recognize is that the L&P
aspect angle path $X^{*}_{k,\omega_i}$ in fact provides a maximum likelihood aspect angle estimate for
each object model $\omega_i$ at any time $t_k$. This aspect angle information from the L&P estimator
can  be  passed  to  the  conventional  filter  as  a  pseudo-measurement,  and  information  from 
the  conventional  filter  is  made  available  to  the  L&P  estimator  as  “a  priori”  information. 
Judgments  as  to  the  “joint  likelihood”  of  measurements  and  states  can  be  made  then,  using 
information  from  both  sources  concurrently. 
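
The exchange of information in such a coupled structure can be sketched for a single scalar aspect angle state. This is only an illustration under strong assumptions: a scalar Kalman filter stands in for the kinematic/aspect-angle estimator, a grid search over aspect cells stands in for the L&P estimator, and all function and parameter names are hypothetical. In this simple form the pseudo-measurement reuses the filter's own prediction, so the sketch should not be read as a statistically rigorous design.

```python
def kalman_predict(x, P, rate, dt, q):
    """Propagate a scalar aspect angle estimate and its variance."""
    return x + rate * dt, P + q

def kalman_update(x_pred, P_pred, z, r):
    """Standard scalar Kalman measurement update."""
    gain = P_pred / (P_pred + r)
    return x_pred + gain * (z - x_pred), (1.0 - gain) * P_pred

def grid_estimate(cells, feature_loglik, prior_mean, prior_var):
    """Stand-in for the L&P step: choose the aspect cell that maximizes the
    feature log-likelihood plus a Gaussian "a priori" term from the filter."""
    scores = [
        ll - 0.5 * (c - prior_mean) ** 2 / prior_var
        for c, ll in zip(cells, feature_loglik)
    ]
    return cells[max(range(len(cells)), key=scores.__getitem__)]

def coupled_step(x, P, rate, dt, q, r, cells, feature_loglik):
    """One cycle: filter prediction -> grid estimator -> pseudo-measurement update."""
    x_pred, P_pred = kalman_predict(x, P, rate, dt, q)
    pseudo_z = grid_estimate(cells, feature_loglik, x_pred, P_pred)
    x_new, P_new = kalman_update(x_pred, P_pred, pseudo_z, r)
    return x_new, P_new, pseudo_z

# Ambiguous signature: aspect cells at 10 and 40 degrees match equally well.
cells = [0, 10, 20, 30, 40, 50]
feature_loglik = [-5.0, 0.0, -5.0, -5.0, 0.0, -5.0]
print(coupled_step(0.0, 25.0, 10.0, 1.0, 1.0, 4.0, cells, feature_loglik))
```

Here the filter prediction resolves the signature ambiguity in favor of the 10 degree cell, and the resulting pseudo-measurement in turn reduces the filter variance.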

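At any particular measurement time $t_k$, an estimate (estimate because we consider
the L&P path only, as in Eqn. (3.18)) for the joint likelihood of measured kinematic and
feature observables and selected state values is given by:

$$
p(X^{*}_{k,\omega_i}, Z^f_k, Z^d_k, X_k \mid Z^d_{k+s-1}, Z^f_{k-1}, \omega_i) =
p(X^{*}_{k,\omega_i}, Z^f_k \mid Z^d_{k+s-1}, Z^f_{k-1}, \omega_i)
\left\{ \prod_{n=1}^{k} \left[ p(z_n, x_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_i) \right] \right\}
\tag{3.20}
$$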
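so that, using Bayes’ Rule, we obtain the desired a posteriori probability of class membership
as:

$$
p(\omega_i \mid Z^f_k, Z^d_{k+s}) =
\frac{ p(X^{*}_{k,\omega_i}, Z^f_k \mid Z^d_{k+s-1}, Z^f_{k-1}, \omega_i)
\left\{ \prod_{n=1}^{k} \left[ p(z_n, x_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_i) \right] \right\} p(\omega_i) }
{ \sum_{j=1}^{J} p(X^{*}_{k,\omega_j}, Z^f_k \mid Z^d_{k+s-1}, Z^f_{k-1}, \omega_j)
\left\{ \prod_{n=1}^{k} \left[ p(z_n, x_n \mid Z^d_{n+s-1}, Z^f_{n-1}, \omega_j) \right] \right\} p(\omega_j) }
\tag{3.21}
$$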


where all quantities are as defined earlier. Note, however, that since the Larson and Peschon
transition likelihoods are now functions of previous feature observable measurements, the
Larson and Peschon joint probability density expression $p(X^{*}_{k,\omega_i}, Z^f_k \mid Z^d_{k+s-1}, Z^f_{k-1}, \omega_i)$
now contains the term $Z^f_{k-1}$ as a conditioning argument, which it did not in Sect. 3.6.
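
The Bayes’ Rule normalization in Eqn. (3.21) can be sketched numerically as follows. The sketch assumes that, for each of the J candidate classes, the L&P joint density and the product of filter likelihoods have already been evaluated and are supplied as logarithms; the function and argument names are hypothetical. Working in the log domain with a subtracted maximum is a common numerical precaution and is not taken from the source.

```python
import math

def class_posteriors(lp_loglik, kin_loglik, priors):
    """A posteriori class probabilities from per-class log-likelihoods.

    lp_loglik[i]  : log of the L&P joint density for class i
    kin_loglik[i] : log of the product of kinematic filter likelihoods for class i
    priors[i]     : a priori probability of class i
    """
    log_terms = [
        a + b + math.log(p) for a, b, p in zip(lp_loglik, kin_loglik, priors)
    ]
    peak = max(log_terms)  # subtract the largest term to avoid underflow
    weights = [math.exp(t - peak) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

# Two classes with equal priors; class 0 has the higher joint likelihood.
print(class_posteriors([-10.0, -12.0], [-5.0, -5.0], [0.5, 0.5]))
```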

It  is  important  to  note  that  the  representation  given  in  Eqn.  (3.21)  is  but  one  in  a 
class  of  such  representations.  First,  following  the  comments  in  Sect.  3.6,  we  may  choose 
to  add  contributions  to  the  likelihood  of  the  observed  feature  measurements  from  paths 
other than the L&P path $X^{*}_{k,\omega_i}$. Second, the choice of kinematic/aspect estimator models
is  completely  open  to  the  designer.  The  estimator  to  be  demonstrated  in  Chapter  IV  is, 
for  example,  similar  to  but  also  significantly  different  from  the  Kendrick/Maybeck/Reid 
and  Andrisani  et  al.  estimators. 

A key point about this recognizer is that theoretically, it can work in real time, unlike
the recognizers discussed in Sect. 3.6.5 and Chapter V, which nominally must employ
smoothers against highly dynamic objects (i.e., aircraft). To attain the same state estimation
accuracy, however, such real-time designs would require in general considerably more
complicated filter models than smoother-based designs.

3.10 Assessing the Proposed Recognizers as Syntactic Approaches

Recalling  the  discussion  in  Sect.  2.2.2,  it  becomes  clear  that  each  of  the  three  proposed 
object  recognition  approaches  is  syntactic  in  nature.  For  the  second  and  third  approaches, 
employing  classical  or  Larson  and  Peschon  “variant”  dynamic  programming-based  sequence 
comparison, this relationship is made clear by the association of DP sequence comparison
and  syntactic  approaches  in  Miclet  [161]  and  Le  Chevalier  [136]. 

The  first  approach,  using  conventional  filter  residuals  and  related  quantities  only,  is 
recognized  to  be  syntactic  when  we  consider  that  presentation  of  observed  quantities  in 
the  proper  time  order  to  the  J  filters  is  absolutely  essential  to  their  function.  As  with  an 
automaton  classically  used  in  syntactic  recognition,  the  correct  conventional  filter  model 
reveals  itself  by  remaining  in  an  acceptable  “state”  as  it  processes  the  observed  sequence  - 
this  acceptable  state  is  simply  obedience  to  proper  behavior  of  the  residual  sequence  as 
discussed in Sect. 2.3.1.3. This observation is equivalent to that made by Therrien [211], in
which  he  associated  residual  sequence  analysis  methods  with  classical  syntactic  theory  [88]. 

3.11  Evaluating  Algorithm  Performance 

3.11.1 Introduction. This section provides approaches for evaluating the performance
of the object recognition algorithms proposed in this chapter. In particular, we
wish to motivate the use of the generalized ambiguity function (GAF) for evaluating algorithm
performance, and for providing insights not available from conventional evaluation
techniques.

3.11.2 Conventional Performance Evaluation. The conventional approach to
evaluating object or target recognition algorithms is to (1) define objects or targets of
interest, (2) obtain real or simulated sensor data for these objects, (3) choose a set of
noise-corrupted measurements from one particular object to represent a true, unclassified
a priori object, and (4) evaluate the ability of any given algorithm to identify correctly the
object that generated those measurements. A correct identification event may be defined as
one in which the value output by an algorithm “tuned” for the correct object (unknown a
priori) is higher than any of the values output by the same algorithm “tuned” for incorrect
object classes. Conversely, an incorrect identification is one in which an improperly-tuned
algorithm gives a higher output value than that of the properly-tuned algorithm. Thresholds
may be established for the differences between two or more recognizers to meet some
confidence level prior to making a decision.

This approach typically quantifies performance in terms of percentage of correct or
incorrect  identification  for  any  algorithm,  true  object,  measurement  set,  and  false  object 
set.  The  “best”  algorithm  is  typically  taken  to  be  the  one  which  has  the  highest  percentage 
or  probability  of  correct  identification  for  some  particular  set  of  objects  and  scenarios.  Each 
of  the  approaches  considered  earlier  in  this  chapter  can  be  evaluated  in  this  fashion. 
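
A minimal sketch of this conventional evaluation follows. It assumes each object class is represented by a short vector of scalar template values, that measurements are the true template plus independent Gaussian noise, and that the recognizer output is a negative squared distance; these choices and all names are illustrative assumptions only.

```python
import random

def percent_correct(templates, true_idx, noise_sigma, n_trials, seed=0):
    """Fraction of Monte Carlo trials in which the recognizer tuned for the
    true object outputs a higher value than every other tuned recognizer."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        # Noise-corrupted measurement set drawn from the true object.
        z = [v + rng.gauss(0.0, noise_sigma) for v in templates[true_idx]]
        # One output value per tuned recognizer (higher means better match).
        scores = [-sum((a - b) ** 2 for a, b in zip(z, t)) for t in templates]
        if max(range(len(templates)), key=scores.__getitem__) == true_idx:
            correct += 1
    return correct / n_trials

templates = [[1.0, 2.0, 3.0], [1.5, 2.0, 2.5]]
print(percent_correct(templates, true_idx=0, noise_sigma=0.3, n_trials=500))
```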

3.11.3  Performance  Evaluation  with  the  Generalized  Ambiguity  Function.  On  the 
other hand, the generalized ambiguity function, as discussed in Sect. 2.7, presents an entirely
different  approach  for  analyzing  object  recognition  system  performance.  To  understand 
this,  we  must  relate  the  parameter  estimation  concepts  of  (1)  states,  (2)  parameters,  and 
(3)  measurements  to  the  corresponding  forms  appropriate  for  object  recognition.  Then,  we 
will  consider  object  recognition  algorithms  as  generators  of  likelihood  functions,  the  mean 
values of which for particular measurement sets and scenarios define generalized ambiguity
functions. Finally, we will discuss how generalized ambiguity functions demonstrate
improved  recognition  through  fusion  of  kinematic  and  sensor  signature  information. 

All object recognition algorithms can in fact be interpreted as likelihood functions in
the sense used by Rao [184:353] and Maybeck [154:75] - whether or not their outputs provide
the classical likelihood value $p(z \mid \Omega)$, i.e., the probability density of some measurement $z$
given that the state and parameter set $\Omega$ is being observed. As discussed in Sect. 2.6.2, all
that we require is that the output value of a likelihood function $L$, defined for a particular
set $\Omega$ of states and parameters, operating on measurements from a process with that
set of states and parameters, should be greater than the output value of any analogous
likelihood function defined for another set of states and parameters, operating on the
(same) measurement set from process $\Omega$.

The  particular  form  of  object  with  which  we  are  most  concerned  in  this  research  is  the 
fixed  wing  aircraft.  The  concepts  of  “states”  in  classical  aircraft  recognition  and  parameter 
estimation  do  not  differ  widely.  In  aircraft  recognition,  the  fundamental  states  of  interest 
are the six degrees of freedom in translation and rotation with respect to some reference
(and  higher  derivatives  of  these  quantities)  for  the  target  aircraft.  Taken  together  with 
the  corresponding  states  for  the  sensor  platform,  these  states  dictate  the  appearance  of  the 
target aircraft to the sensor. Other “states” may be of interest as well - engine speed and
temperature,  orientation  of  objects  on  the  target  relative  to  the  target  body  frame,  and  so 
on.  Similarly,  the  concepts  of  measurements  in  aircraft  target  recognition  and  parameter 
estimation  are  identical. 

The  concept  of  “parameters”  in  object  recognition,  however,  is  not  well  defined  - 
extending this concept is key to this research. Intuitively, a real object must only be representable
as an infinite dimensional vector, and there may be any number of ways to define
“basis  vectors”  or  individual  dimensions  in  the  corresponding  vector  space.  For  example, 
for  aircraft  targets,  one  might  define  an  abstract  parameter  space  based  on  classical  aircraft 
design  features  -  fuselage  length,  diameter,  wingspan,  and  so  on.  Clearly,  such  a  parameter 
space  could  have  both  continuous  and  discrete  (e.g.,  number  of  engines)  attributes.  One 
set  of  basis  vectors  might  serve  as  well  as  some  other  to  define  objects  and  behavior  of 
interest. A particular question arises immediately regarding fidelity of necessarily finite-dimensional
models versus infinite-dimensional truth - how many parameters of what kind
are  required  for  a  model  to  achieve  a  given  level  of  closeness  (with  respect  to  some  metric) 
to  true  behavior?  This  research  does  not  address  that  question  per  se,  but  will  motivate 
the  need  for  further  research  in  that  direction  in  Chapter  VI. 

Classical object recognition evaluation approaches, as discussed in the previous section,
sidestep the question of parameter spaces by considering only discrete points in that
space  corresponding  in  some  sense  to  known  objects.  Classical  approaches  evaluate  the 
performance  of  recognition  algorithms  at  these  points,  but  not  elsewhere. 

In  evaluating  a  recognition  algorithm,  however,  the  performance  of  that  algorithm 
against objects in some sense “in between” real objects of interest should be important
as  well.  For  example,  in  general  we  should  probably  prefer  (1)  a  “robust”  algorithm  that 
returns  high  likelihood  function  values  for  objects  close  in  some  sense  to  the  design  object, 
over  (2)  an  algorithm  which  fails  utterly  (returns  low  likelihood  values)  when  presented 
with measurements from an object with only minor variations from the design object. On
the  other  hand,  if  we  truly  wish  to  identify  small  variations  from  some  design  point,  such 
a  robust  algorithm  may  be  utterly  inadequate.  Consider  the  case  in  which  we  desire  to 
distinguish  an  F-4G  (“Wild  Weasel”  air  defense  suppression  variant)  Phantom  II  from 
an F-4E (standard multirole tactical fighter) Phantom II: a robust algorithm would be
counterproductive. 

The  intention  here  is  not  to  propose  definitive  models  or  rules  for  considering  object 
parameter  spaces.  The  point  is  simply  that  the  use  of  the  generalized  ambiguity  function  in 
object  recognition  forces  one  to  consider  the  existence  and  significance  of  these  spaces,  and 
provides a natural tool for evaluating recognition algorithm performance as the algorithms -
likelihood functions - are “tuned” over various domains in these spaces. In Chapter VI,
we will consider other implications of the concept of object parameter spaces.

The aircraft model format used in Chapter V to evaluate “motion warping” algorithms
offers one natural approach for defining “pseudo-objects” or object parameter sets
in some sense in between real objects of interest. That chapter will discuss the definition
of particular objects and “pseudo-objects” of interest. Likelihood function values are then
defined for each of these points, and output values are found for each likelihood function
operating on measurement sets from a “true” object. Note that likelihood functions are
distinguished here from one another according to (1) their form or structure (the way
they use information); and for functions with the same structure, (2) the particular object
signatures they are tuned to identify.

It is important to note that this will entail a “Monte Carlo” evaluation of the generalized
ambiguity function - analytical evaluations of the integral expression Eqn. (2.37) for a
“motion warping” likelihood function appear intractable, due to the presence of numerous
nonlinearities. For scenarios in which a completely linear, Gaussian description of the joint
conditional density $f_{Z \mid \Omega}(Z \mid \Omega_t)$ could be obtained, however, an analytical evaluation of
the ambiguity function (Eqn. (2.37)) for this classical likelihood function (in natural log
form) also could be obtained, as in [152].

Now, consider contrasting (1) the ambiguity function for an object recognition algorithm
which fuses kinematic and feature observable information with (2) the ambiguity
function  for  an  object  recognition  algorithm  which  uses  feature  observable  information 
only.  How  should  they  differ? 

The discussion in Sect. 2.6.2 answers this question - when a set of measurements is
matched  to  the  wrong  object  model,  or  “point”  in  parameter  space,  a  kinematic/feature 
observable  fusion  algorithm  is  more  likely  to  yield  lower  likelihood  function  values  than 
one  which  does  not  consider  kinematics.  With  reference  to  Eqn.  (2.37),  we  consider  that 
as the parameters y in $\Omega$ change, a likelihood function restricted by kinematics will have
a  smaller  chance  of  (improperly)  finding  a  high  likelihood  function  value  somewhere  in  a 
large  allowable  aspect  angle  extent  than  a  likelihood  function  which  is  not  restricted  by 
kinematics. 

Since  the  generalized  ambiguity  function  is  the  mean  value  of  the  likelihood  function, 
then,  the  kinematic/feature  observable  fusion  algorithm  should  provide  a  more  sharply 
peaked,  or  less  “broad”  generalized  ambiguity  function,  having  greater  curvature  at  the 
correct parameter value than methods that do not fuse information from feature observables
and motion. This reduced ambiguity implies increased discrimination capability
for  the  recognition  algorithm  which  fuses  kinematic  and  feature  observable  information. 
Chapter  V  will  show  this  behavior  graphically. 
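
The expected sharpening can be illustrated with a toy Monte Carlo evaluation. The sketch below assumes a hypothetical scalar signature that depends on one object parameter and one aspect angle, so that a change in the parameter can be masked by a change in aspect; the recognizer output is the best match score over the aspect cells it is allowed to search. None of these modeling choices come from the source.

```python
import math
import random

def signature(a, theta):
    """Hypothetical scalar signature of an object with parameter a at aspect theta."""
    return a + math.sin(theta)

def likelihood(z, a, thetas):
    """Best match score over the aspect cells the recognizer may search."""
    return max(-(z - signature(a, th)) ** 2 for th in thetas)

def monte_carlo_gaf(a_grid, thetas, a_true=0.0, theta_true=0.0,
                    sigma=0.05, n_runs=200, seed=1):
    """Mean recognizer output at each candidate parameter value."""
    rng = random.Random(seed)
    zs = [signature(a_true, theta_true) + rng.gauss(0.0, sigma)
          for _ in range(n_runs)]
    return [sum(likelihood(z, a, thetas) for z in zs) / n_runs for a in a_grid]

# Feature-only recognizer: every aspect cell is allowed.
all_aspects = [math.radians(d) for d in range(-180, 180, 2)]
# Fused recognizer: kinematics restrict the search to within 10 degrees of truth.
gated_aspects = [th for th in all_aspects if abs(th) <= math.radians(10)]

a_grid = [-0.8, -0.4, 0.0, 0.4, 0.8]
gaf_feature_only = monte_carlo_gaf(a_grid, all_aspects)
gaf_fused = monte_carlo_gaf(a_grid, gated_aspects)
for a, u, f in zip(a_grid, gaf_feature_only, gaf_fused):
    print(f"a={a:+.1f}  feature-only={u:.4f}  fused={f:.4f}")
```

In this toy case the feature-only function is nearly flat over the parameter grid, because some aspect angle always explains the measurement, while the kinematically restricted function falls away from the true parameter value.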

Finally,  recall  from  Sect.  2.7  that  the  curvature  of  the  generalized  ambiguity  function 
at  the  correct  parameter  value  is  directly  related  to  the  Cramer-Rao  lower  bound  for 
the  covariance  of  a  parameter  estimate  by  that  likelihood  function.  This  means  that  by 
choosing  to  consider  continuous  parameter  spaces  in  object  and  target  recognition,  and 
by  using  the  generalized  ambiguity  function,  we  can  define  the  Cramer-Rao  lower  bound 
as  a  measure  of  relative  performance  for  our  recognition  algorithms  -  whether  or  not 
the likelihood function is the classical likelihood function $p(z \mid \omega_i)$. In classical target
recognition  at  least,  this  approach  has  evidently  never  been  proposed. 
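
For the classical log-likelihood, the relationship between curvature and the Cramer-Rao lower bound can be checked numerically in a simple case. The sketch assumes a scalar parameter observed through a number of independent Gaussian measurements, for which the mean log-likelihood is known in closed form and the bound equals the noise variance divided by the number of measurements; the function names and the finite-difference step are illustrative choices.

```python
def gaf_gaussian(a, a_true, sigma, n_meas):
    """Mean log-likelihood (additive constant dropped) for n_meas independent
    Gaussian measurements of a scalar parameter with true value a_true."""
    return -n_meas * ((a - a_true) ** 2 + sigma ** 2) / (2.0 * sigma ** 2)

def crlb_from_curvature(gaf, a_true, h=1e-3):
    """Negative inverse of the second difference of the ambiguity function,
    evaluated at the true parameter value."""
    curvature = (gaf(a_true + h) - 2.0 * gaf(a_true) + gaf(a_true - h)) / h ** 2
    return -1.0 / curvature

sigma, n_meas, a_true = 2.0, 4, 1.0
bound = crlb_from_curvature(lambda a: gaf_gaussian(a, a_true, sigma, n_meas), a_true)
print(bound, sigma ** 2 / n_meas)
```

A more sharply peaked ambiguity function has larger curvature magnitude at the true value and therefore a smaller bound, which is the sense in which curvature serves as a relative performance measure.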

3.12  Chapter  Summary 

The goals of this chapter were to propose new approaches for recognition of dynamic
objects in general and aircraft targets in particular. These new approaches were
advanced by defining estimators to provide: (1) the likelihood of kinematic measurements
and pseudo-measurements conditioned on feature observables, (2) the likelihood of feature
observable measurements conditioned on kinematic measurements, and (3) the joint
likelihood of kinematic and feature observable measurements. We then used these likelihoods
with Bayes’ Rule and a priori probabilities to propose object recognition algorithms.
Finally,  we  considered  tools  for  evaluating  the  performance  of  these  algorithms. 

The  following  two  chapters  will  show  results  from  algorithms  of  the  first  two  types, 
indicating  the  potential  for  improved  object  recognition  from  all  three  approaches,  by 
comparison  where  possible  with  classical  techniques  that  do  not  fuse  kinematic  and  feature 
observable  information.  Both  classical  techniques  and  generalized  ambiguity  functions  will 
be  used  to  assess  performance. 

The theoretical and practical contributions of the effort described in this chapter are
major. They encompass the essence of this dissertation, and include:

(1)  Extension  of  conventional  multiple  model  residual  sequence  analysis  techniques  and 
kinematic/aspect-angle  trackers  to  provide  new  methods  for  object  and  target  recognition, 
in  particular  where  sensor  measurements  are  not  linearly  predictable. 

(2)  Extension  of  the  Larson  and  Peschon  equations  to  provide  new  methods  for  object 
recognition  using  measurements  from  ambiguous  feature  observable  spaces,  considering  a 
priori  information  from  kinematics  and  other  sources  as  to  the  likelihood  of  transitions  on 
the  underlying  aspect  angle  state  space. 

(3)  Extensions  of  the  theory  and  practice  of  classical  sequence  comparison  to  include  feature 
observable  sequences  arising  from  an  aspect  angle  subspace. 

(4) Combination of the Larson and Peschon equations with conventional linear estimators
to provide a new form of estimator, suitable in particular for object recognition with
ambiguous feature observables, generated from dynamic subspaces that exhibit linear behavior
in some respects.

(5)  Through  contributions  (1)  through  (4)  and  application  of  Bayes’  Rule,  several  new 
approaches  for  multisensor  fusion  to  obtain  an  a  posteriori  probability  of  object  class 
membership,  conditioned  jointly  on  kinematic  and  “nonkinematic”  or  feature  observable 
information  and  a  priori  information  for  each  known  object  class. 

(6) Identification of a new method for evaluating object recognition algorithms - the
generalized ambiguity function.

(7) Extension of classical parameter space concepts into the field of object and target
recognition.

(8) Through contributions (6) and (7), identification of a practical approach for obtaining
a Cramer-Rao lower bound for dynamic object and target recognition algorithms.

The remainder of this dissertation will demonstrate and elaborate upon these
developments. Chapter VI proposes extensions to the author’s efforts.

IV. Residual Analysis for Sensor Fusion and Object Recognition


4.1 Introduction

In this chapter, we will propose a kinematic/aspect-angle tracking filter design for
aircraft target recognition, and use it for residual/state analysis and Bayesian parameter
estimation methods, as discussed in Sect. 2.3.1.3. The purpose of this effort is to
demonstrate “Step One” as proposed in the introduction to Chapter III - the use of multiple
model/residual sequence analysis approaches for recognition of objects having “coupled”
state dynamics models, using kinematic and feature observable measurements, without
explicitly considering residuals in the feature observable space.

The efforts in this chapter were inspired originally by the simple observation that the
performance of Kendrick et al. and Andrisani et al. trackers (see Sect. 2.3.3.1) must be
very sensitive to proper choice of target model. Using the classical residual sequence
analysis approach, we will exploit that sensitivity by noting that when a particular association
of (1) measurements from an unclassified target and (2) filter target model fails to
provide expected residual and/or state behavior, that association is suspect. Conversely, the
association that exhibits the “best” residual/state behavior may be taken to indicate the
correct target class. Finally, information of this kind from many different measurement
sources and particular states can be treated using Bayes’ Rule for proper probabilistic
“weighting” to obtain a maximum a posteriori estimate of target class.
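
The weighting step described above can be sketched in a few lines. This is a minimal illustration rather than the dissertation's implementation: it assumes that each candidate target model's filter supplies a residual and a filter-computed residual covariance at every measurement, and that residuals from a correctly matched filter are zero-mean Gaussian. The function names and the two-class numbers are hypothetical.

```python
import numpy as np

def residual_log_likelihood(residual, innovation_cov):
    """Log of the zero-mean Gaussian density evaluated at one filter residual."""
    r = np.atleast_1d(np.asarray(residual, dtype=float))
    S = np.atleast_2d(np.asarray(innovation_cov, dtype=float))
    _, logdet = np.linalg.slogdet(S)
    maha = float(r @ np.linalg.solve(S, r))  # squared Mahalanobis distance
    return -0.5 * (maha + logdet + r.size * np.log(2.0 * np.pi))

def update_class_posterior(prior, log_likelihoods):
    """One Bayes' Rule step: posterior is proportional to prior times likelihood."""
    log_post = np.log(np.asarray(prior, dtype=float)) + np.asarray(log_likelihoods, dtype=float)
    log_post -= log_post.max()  # rescale before exponentiating to avoid underflow
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical two-class case: the first filter's residual is consistent with its
# predicted covariance, the second filter's residual is not.
S = np.diag([1.0, 1.0])
log_l = [residual_log_likelihood([0.3, -0.2], S),
         residual_log_likelihood([3.0, 2.5], S)]
posterior = update_class_posterior([0.5, 0.5], log_l)
map_class = int(np.argmax(posterior))  # index of the maximum a posteriori class
```

Applying `update_class_posterior` recursively, with each posterior serving as the next prior, accumulates evidence over a residual sequence.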

The  proposed  filter  design  is  by  no  means  uniquely  suited  to  this  application  -  the 
distinguishing  attribute  of  the  filter  shown  here  is  extreme  simplicity.  It  is  designed  more 
to  fail  well  when  it  should  fail  than  to  track  well  in  a  variety  of  situations  -  the  key  point 
is  that  robustness  to  incorrect  target  model  choices  is  not  a  virtue  in  this  application. 

4.2 A Tracking Filter for Target Recognition

4.2.1 Design Philosophy. In App. C, the reader will observe that the Kendrick
et al. and Andrisani et al. kinematic/aspect filter designs require a reasonable amount
of target class-specific information that can only be obtained by flight control analysis or
empirical identification based on observing the respective target classes in flight. For this
effort, this level of information was neither available nor required.

Consistent  with  the  approach  in  the  previous  chapters,  we  will  restrict  this  filter  to 
operate  under  the  assumption  that  the  target  flies  in  a  straight  (not  necessarily  level)  path 
or  in  a  coordinated  turn  with  constant  acceleration  relative  to  the  target  body  frame.  For 
the  short  periods  of  time  (several  seconds)  over  which  we  intend  to  use  this  tracker,  these 
are  considered  to  be  valid  assumptions  -  in  any  case,  we  will  discuss  actions  to  be  taken 
in  the  event  that  these  assumptions  appear  to  fail.  The  target’s  key  turn-defining  states 
will  be  modelled  as  constants  (“driven”,  or  influenced  to  change  during  propagation,  only 
by  small  amounts  of  “continuous”  white  noise  in  the  usual  manner  described  in  Sect.  2.3), 
and  assumed  to  be  unknown  at  the  start  of  the  tracking  sequence  -  the  filter’s  first  task 
after  initiation  is  to  converge  to  reasonable  estimates  for  these  states.  These  key  turn- 
defining  states  are  the  roll  angle  and  angle  of  attack  for  the  turn,  relating  in  the  classical 
nonlinear  fashion  to  target  velocity  and  position  relative  to  the  inertial  frame  (as  discussed 
in  Sect.  5.5.3). 

The  “confusion”  experienced  by  this  filter  when  true  and  filter  target  models  are 
mismatched  will  be  due  primarily  to  significantly  incorrect  target  rotation  state  estimates, 
driven  by  aspect  angle  pseudo-measurements  given  to  the  filter.  Recall  that  the  term 
pseudo  applies  because  these  quantities  come  from  a  pose  estimator,  rather  than  directly 
from  a  sensor  per  se.  As  in  the  Kendrick  and  Andrisani  efforts,  the  proposed  filter  will 
place  a  great  deal  of  reliance  on  these  aspect  angle  pseudo-measurements  in  defining  the 
target  acceleration  state  estimate.  Ultimately,  for  the  wrong  target  model  choice,  the 
mismatch  between  (1)  estimated  acceleration  (driven  by  the  incorrect  pose  estimate)  on 
one hand, and (2) estimated velocity and position (driven by reasonably good Doppler
velocity and position measurements) on the other hand, will cause significant residuals
and/or unreasonable values for some states - betraying this incorrect model choice in the
classical fashion discussed in Sect. 2.3.1.3.

We will now discuss how these mismatches can arise in practice, and how they affect
filter processes. Note Fig. 4.1 during this discussion. First of all, assume that we are
tracking an unrecognized aircraft target with radar and some (notional) imaging sensor
as it turns from a crossing trajectory toward the sensor platform, as if to launch a missile
at the platform. Simply tracking the target as a point object using the radar gives a
good estimate of target position and velocity states - the imaging sensor will provide a
pose estimate (and could improve cross-range position and velocity estimates if desired,
although this is not essential to the discussion).

Figure 4.1. Tracking / Recognition Scenario for Discussion

Finally, assume that the true target is a MIG-21, but that one of our target recognition
alternatives requires us to try associating the target image with that of an F-4.
This choice of candidate targets is not accidental - in image feature observable domains
alone, even conditioned to some extent on kinematics (e.g., limiting aspect angle search
windows according to observed motion), these two aircraft classes can be confused
easily [228:77]. We will see, however, that in the kinematic state domain, conditioned on
image information, they are readily distinguishable under common conditions.

Note that the F-4 is quite a bit longer than the MIG. Since range to the target is
reasonably well known, our notional recognizer will presumably make use of that length
difference (i.e., with reference to App. B.3, the recognizer is not, and should not be,
scale-invariant). In order to fit a model or statistical library representation of an F-4 to the
image of an actual MIG, the pose estimator will in effect “pitch up” (or down) the F-4
model, decreasing its apparent length to match the image of the unclassified target (MIG).
The problem now is that the resulting pose estimate, delivered to the kinematic/aspect
tracker, will imply an extremely large angle of attack for the “F-4”. This further implies a
much higher acceleration magnitude than the target is actually executing. Over time, this
unreasonable acceleration estimate will cause inconsistency within the filter, as position
and velocity estimates develop based on kinematic measurements.

Moreover,  the  wingspan  of  the  F-4  is  quite  a  bit  larger  than  that  of  the  MIG.  To 
resolve  this  difference,  the  pose  estimator  will  presumably  in  effect  “roll”  the  F-4  model, 
decreasing  its  apparent  width  to  match  the  image  of  the  unclassified  target.  This  error  in 
the  pose  estimate  will  imply  that  the  plane  of  the  wings  lies  at  a  considerably  different 
orientation  than  does  the  true  orientation,  which  will  lead  to  large  errors  in  the  estimated 
direction  of  the  target  acceleration.  This  factor  too  will  lead  to  inconsistency  within  the 
filter,  as  we  shall  see. 
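
The size of these pose biases can be illustrated with simple foreshortening geometry. The sketch below is not from the source: it assumes a broadside view in which rotating a model through an angle shrinks its projected extent by the cosine of that angle, and the 20 percent size difference is a purely hypothetical figure.

```python
import math

def foreshortening_angle_deg(model_extent, observed_extent):
    """Rotation (degrees) that shrinks a model's projected extent to the observed one.

    Assumes simple cosine foreshortening: observed = model * cos(angle).
    """
    if model_extent <= 0 or observed_extent <= 0:
        raise ValueError("extents must be positive")
    ratio = observed_extent / model_extent
    if ratio > 1.0:
        raise ValueError("rotation can only shrink the projected extent")
    return math.degrees(math.acos(ratio))

# Hypothetical case: the library model is 20 percent longer than the imaged target.
pitch_bias_deg = foreshortening_angle_deg(model_extent=1.2, observed_extent=1.0)
print(f"implied pitch (or roll) bias: {pitch_bias_deg:.1f} deg")
```

Under these assumptions even a modest size mismatch maps to a pose bias of tens of degrees, which is why the implied angle of attack and lift orientation become kinematically unreasonable.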

Generically,  the  pose  estimate  errors  described  thus  far  are  “bias”  errors.  Other 
sensors  -  high  range  resolution  (HRR)  radar  and  so  on  -  should  also  be  expected  to 
provide  pose  estimates  exhibiting  biases  due  to  incorrect  model  choices.  For  example,  the 
Mahalanobis  HRR  metric  used  in  Chapter  V  is  extremely  sensitive  to  length  (in  range) 
of  the  HRR  signature  -  as  in  the  previous  example,  a  pose  estimator  using  this  metric 
can  be  expected  to  “turn”  a  long  library  model  to  find  a  best  fit  for  the  (short)  signature 
from  a  smaller  actual  target  (note  that  the  words  “long”  and  “short”  are  italicized  because 
HRR  signature  “length”  is  a  function  of  multi-bounce  and  other  effects,  as  well  as  target 
physical  size,  as  discussed  in  Sect.  2.2.3). 

Other forms of errors may be expected from any pose estimator, like erratic variations
between measurements (more akin to “white” measurement noise errors), or
time-correlated errors (e.g., relatively slow wandering in aspect angle, which can be modelled as
the outputs of integrators driven by white Gaussian noise). These “classical” error types
are described in [153:183]. For radar-derived pose estimates in particular, we might expect
“white”-type errors due to scatterer motion that would affect pose estimates for correct or
incorrect model choices.

Bias errors due to incorrect target model choices and white noise from various sources
are considered to be well-representative of typical errors expected in actual target
recognition scenarios. Moreover, from a theoretical perspective, bias and white errors tend to
“bound” the different forms of error in general. For these reasons, these classes of errors
will be the only ones considered in this demonstration.

4.2.2 The Filter State Model. The state dynamics model for the proposed
extended Kalman filter is given in the equation:

(4.1)

where:

$p_{t/i\,N,E,D}$ = position of the target in inertial frame coordinates, i.e., with components
taken along the North, East, or Down axes of an earth-surface inertial or navigation frame.
Dot notation denotes time rate of change of the indicated variable, as observed from and
coordinatized in the indicated frame.

$v_{t/i\,N,E,D}$ = velocity of the target relative to the inertial frame in inertial frame
coordinates, i.e., as observed from and coordinatized in the navigation (inertial) frame.
Figure 4.2. Distinction Between $\alpha_b$ and $\alpha_c$

$\alpha_b$ = the angle between the zero-lift longitudinal axis of the target and the target
body longitudinal reference axis used by the target recognition algorithm. This angle
is assumed to be constant for any given encounter, a function of the aircraft structure
and trim conditions, and all of the factors that affect trim conditions (fuel status, stores,
etc.). The degree to which this quantity is considered “known” or not by the filter can be
varied by adjusting the state’s initial filter covariance. Unlike the following two variables
$\alpha_c$ and $\alpha_{gm}$, which enter into lift force equations, $\alpha_b$ does not, thereby mitigating what
would otherwise be a serious observability problem, particularly between $\alpha_b$ and $\alpha_c$. The
difference between $\alpha_b$ and $\alpha_c$ is shown in Fig. 4.2.

$\alpha_c$ = the angle between the zero-lift longitudinal axis of the target and the angle
of attack required to achieve the desired (pilot-commanded) lift. This angle is treated
as a constant over the period of a few seconds required for this algorithm to function,
reflecting the fact that aircraft turns are generally held at a near-constant turn rate for
several seconds. As discussed below, this quantity is treated as unknown initially (i.e.,
this state has a high initial filter covariance), and can be “reset” to unknown (by again
artificially increasing the state’s filter covariance) when filter residuals indicate a change
in turn state for all models - this is an online adaptive estimation approach.

$\alpha_{gm}$ = the angular difference between the nominal angle of attack $\alpha_c$ required to
achieve the desired maneuver (turn) and the actual angle of attack, modelling small
perturbations about the nominal that tend to null to zero quickly. (It is modelled as the
output of a first-order Gauss-Markov process - hence the subscript “gm”.)

$\phi_c$ = the roll angle between the vertical “wings-level” attitude and the angle required
to attain the desired orientation of the plane of the wings, as shown in Fig. 5.24. Treated
as a constant for the same reason, and in the same way, as $\alpha_c$, noted above.

$\phi_{gm}$ = the angular difference between the roll angle required to achieve the desired
turn direction and the actual roll angle, modelling small perturbations about the nominal
that tend to null to zero quickly. (It is modelled as the output of a first-order
Gauss-Markov process - hence the subscript “gm”.)

$\beta_{gm}$ = the angular difference between the nominal zero sideslip angle and the actual
sideslip angle, modelling small perturbations about the nominal that tend to null to zero
quickly. (It is modelled as the output of a first-order Gauss-Markov process - hence the
subscript “gm”.)

g  =  acceleration  due  to  gravity 

$n_{lf}$ = aerodynamic load factor normal to the velocity vector, or, equivalently,
acceleration due to lift force, as computed in Eqn. (5.1). It is important to note that the $\alpha$ or
angle of attack argument used to compute $n_{lf}$ is the sum of $\alpha_c$ and $\alpha_{gm}$: $\alpha_b$, as noted in
the definition of this variable above, accounts for reference differences or trim conditions
only, and does not contribute to aerodynamic lift.

$k_{aero}$ = a multiplicative (scaling) factor for aerodynamic load factor (lift acceleration)
$n_{lf}$ which will be estimated to account for uncertainties in the factors of Eqn. (5.1) -
including primarily aircraft mass, coefficient of lift, wing surface area, and, to a lesser
extent, velocity and other factors. It is treated as a constant in the same way as $\alpha_c$, noted
above.
$C_l^i$ = the direction cosine matrix to convert a vector in “lift frame” coordinates to
one in inertial frame coordinates. The lift frame is found by rotating the velocity frame
(Fig. 5.23) through the roll angle ($\phi_c + \phi_{gm}$) and defines the orientation of the normal
load acceleration with respect to inertial space. The procedures for defining this direction
cosine matrix (and others) are found in App. C. The two numbers associated with each
term of this variable in Eqn. (4.1) are respectively the row and column required for the
particular scalar element from $C_l^i$ used at that point.

$\tau_\alpha$ = correlation time for the first order Gauss-Markov process modelling the behavior
of $\alpha_{gm}$. This quantity would be estimated theoretically or empirically for each target class.

$\tau_\phi$ = correlation time for the first order Gauss-Markov process modelling the behavior
of $\phi_{gm}$. Estimated theoretically or empirically for each target class.

$\tau_\beta$ = correlation time for the first order Gauss-Markov process modelling the behavior
of $\beta_{gm}$. Estimated theoretically or empirically for each target class.

$w_{(\cdot)}$ = appropriate continuous time (heuristically) zero-mean
white Gaussian process driving noises, with appropriate strength $Q(t)$ as defined in
Sect. 2.3.1 and discussed in Sect. 4.2.4 below.


and  : 


(4.2) 


These particular states were chosen because they reflect the coupling between
translational and rotational dimensions for a conventional aircraft more simply and directly than
many, if not all, other representations [120, 121, 5]. A more conventional representation
for rotational states, such as a set of Euler angles relative to an inertial frame, contains the
same information but requires more complicated transformations to relate angular state
to translational accelerations. A shortcoming in the representation here is the treatment
of angular perturbations as Gauss-Markov processes - additional aircraft-peculiar
parameters as discussed in App. C would allow better modelling, but that level of detail was not
required for this effort.
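
The coupling between rotational and translational states can be made concrete with a sketch of the deterministic part of a 13-state derivative. This is an illustration under stated assumptions, not a reconstruction of Eqn. (4.1): the state ordering (position, velocity, then seven angular and scaling states), the choice of the negative third lift-frame axis as the lift direction, the neglect of thrust and drag, and the function and argument names are all hypothetical. Driving noise terms are omitted.

```python
import numpy as np

G_FT_S2 = 32.174  # standard gravity in ft/s^2

def state_derivative(x, n_lf, C_il, tau_alpha, tau_phi, tau_beta, g=G_FT_S2):
    """Deterministic x_dot for an assumed 13-state kinematic/aspect-angle model.

    Assumed ordering: x[0:3] position (N,E,D), x[3:6] velocity (N,E,D),
    x[6] alpha_b, x[7] alpha_c, x[8] alpha_gm, x[9] phi_c, x[10] phi_gm,
    x[11] beta_gm, x[12] k_aero.
    n_lf is the lift acceleration; C_il maps lift-frame vectors to inertial axes.
    """
    x = np.asarray(x, dtype=float)
    C_il = np.asarray(C_il, dtype=float)
    x_dot = np.zeros(13)
    x_dot[0:3] = x[3:6]                                  # position rate is velocity
    lift_dir = -C_il[:, 2]                               # ASSUMPTION: lift along -z of lift frame
    x_dot[3:6] = x[12] * n_lf * lift_dir + np.array([0.0, 0.0, g])
    x_dot[8] = -x[8] / tau_alpha                         # Gauss-Markov decay, alpha_gm
    x_dot[10] = -x[10] / tau_phi                         # Gauss-Markov decay, phi_gm
    x_dot[11] = -x[11] / tau_beta                        # Gauss-Markov decay, beta_gm
    return x_dot                                         # constant states have zero rate
```

With the lift frame aligned to the inertial axes, unit scale factor, and lift acceleration equal to g, this sketch returns zero acceleration, i.e., straight and level flight.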
4.2.3 The Filter Measurement Model. The measurement model for the proposed
extended Kalman filter is given in Eqns. (4.3) and (4.4) below. We assume that
measurements of sensor-to-target range, (pointing) angle, and range rate are generated in the usual
fashion by a radar tracker, and that range and angle are provided to the filter as target
position in sensor frame coordinates (assumed instantaneously inertial). We assume that
the pose estimator provides estimates of the three target body Euler angles relative to the
filter inertial (navigation) frame, which are then processed relative to the target velocity
vector estimate to yield pseudo-measurements of (1) angle of attack, (2) roll angle and (3)
sideslip angle for input to the filter.

The sensor or predicted line of sight (pls) frame is a right-handed Cartesian reference
frame  defined  by  (1)  the  predicted  sensor-to-target  (boresight)  vector,  (2)  the  perpendicular 
(elevation)  axis  lying  in  a  plane  parallel  to  the  local  horizontal,  and  (3)  the  remaining 
(azimuth)  axis,  pointing  generally  down.  This  relationship  is  illustrated  in  Fig.  4.1.  The 
orientation  of  the  local  horizontal  (i.e.,  the  direction  of  gravity)  and  magnitude  of  gravity 
are  assumed  to  be  perfectly  known.  The  implications  of  this  assumption  of  perfect  gravity 
knowledge  will  be  discussed  at  the  end  of  this  section. 

The  relationship  between  the  (filter  state  or  navigation  frame)  inertial  and  sensor 
frames is given by the direction cosine matrix $C_i^{pls}$. Procedures for defining $C_i^{pls}$ are found
in  App.  C.  Note  that,  although  this  matrix  is  itself  a  function  of  the  filter  states,  it  is 
assumed  to  be  constant  at  any  measurement  event  (i.e.,  the  sensor  frame  is  impulsively 
updated  -  an  appropriate  assumption  for  this  frame,  which  will  be  artificially  maintained 
in  software,  even  in  actual  implementation). 

As discussed in Sect. 2.3.2.1 and [35], by comparison with other alternatives, this
sensor frame approach offers several advantages. First, it is inertial between impulsive
updates. Second, since range and angle error for our cases of interest are largely a function
of target extent independently along each axis in the true (but unknown a priori)
line-of-sight frame, and the predicted line-of-sight frame (pointing at the predicted target location)
is the best estimate of that true frame, it is reasonable to treat the expected measurement
error statistics in the predicted frame as independent along each axis (or diagonalized, in
matrix form). Both the inertial nature and independent error statistics of this frame bring
computational savings in the tracker, with little effort or computational penalty required
for impulsive updates.


In  computing  true  and  filter-estimated  target  locations,  it  is  important  to  separate 
computations,  in  particular  to  prevent  the  filter  from  using  truth  knowledge,  which  an 
actual  system  could  not  access.  In  this  simulation,  position  measurements  treat  the  true 
target  as  a  point,  adding  independently  generated  Gaussian  noise  samples  to  each  axis  in 
the  true  line-of-sight  frame  (for  the  reason  noted  in  the  previous  paragraph).  The  point  in 
space  defined  by  this  corrupted  location  in  the  true  frame  is  transformed  into  the  predicted 
line-of-sight  frame,  as  it  would  be  “seen”  by  the  sensor.  For  simulation  purposes,  the  true 
frame  is  generated  in  the  same  fashion  as  the  sensor  frame  -  their  only  difference  is  that 
the  true  LOS  frame  points  at  the  true  target  location  (unknown  to  the  filter),  whereas 
the  sensor  frame  points  at  the  best  estimate  for  the  target  location.  The  reference  frame 
drawing  in  Fig.  4.1  applies  generally  to  either  the  true  or  predicted  line-of-sight  frames. 
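
The separation of truth and filter computations described above can be sketched as follows. The actual direction cosine matrix procedures are those of App. C, which is not reproduced here, so the frame construction below is an assumption: rows are ordered boresight, azimuth, elevation; the elevation axis is horizontal; the azimuth axis points generally down; and inertial axes are North, East, Down. All function names are hypothetical.

```python
import numpy as np

DOWN = np.array([0.0, 0.0, 1.0])  # Down axis in North-East-Down coordinates

def los_frame_dcm(sensor_pos, aim_point):
    """Inertial-to-line-of-sight DCM; rows are boresight, azimuth, elevation axes.

    Assumed construction; undefined if the boresight is exactly vertical.
    """
    b = np.asarray(aim_point, float) - np.asarray(sensor_pos, float)
    b = b / np.linalg.norm(b)
    e = np.cross(b, DOWN)           # horizontal axis perpendicular to boresight
    e = e / np.linalg.norm(e)
    a = np.cross(e, b)              # completes the right-handed triad, points generally down
    return np.vstack([b, a, e])

def simulate_position_measurement(sensor_pos, true_target, predicted_target, sigma, rng):
    """Corrupt the true target position in the true LOS frame, then express the
    corrupted point in the predicted LOS (sensor) frame as the filter would see it."""
    sensor_pos = np.asarray(sensor_pos, float)
    true_target = np.asarray(true_target, float)
    C_true = los_frame_dcm(sensor_pos, true_target)       # uses truth knowledge
    C_pred = los_frame_dcm(sensor_pos, predicted_target)  # uses filter estimate only
    noisy_true = C_true @ (true_target - sensor_pos) + rng.normal(0.0, sigma, 3)
    noisy_inertial = C_true.T @ noisy_true                # back to inertial axes
    return C_pred @ noisy_inertial
```

Only `C_pred` and the returned vector would be visible to the filter; `C_true` and `true_target` stay on the truth-model side of the simulation.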

The measurement error (ellipsoid) statistics are also maintained separately for use as
appropriate in the true or filter-modelled sensor frames. As for the position measurements,
the range-rate measurement is generated relative to the true line-of-sight frame, but is
assumed by the filter to be along the boresight axis in the predicted line-of-sight frame.

Finally, $z(t_i)$, the measurement at time $t_i$, is modelled as the sum of $h[x(t_i), t_i]$
(nonlinear form due to the consideration of ownship states not modelled in the filter state vector
$x$) and a vector of discrete time zero mean white Gaussian noise $v(t_i)$, with an appropriate
covariance $R(t_i)$ (see Eqn. (2.15)), or:

$$ z(t_i) = h[x(t_i), t_i] + v(t_i) \qquad (4.3) $$

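where:

$$
h[x(t_i)] =
\begin{bmatrix} p_{t/a\,B} \\ p_{t/a\,A} \\ p_{t/a\,E} \\ v_{t/a\,B} \\ \alpha_m \\ \phi_m \\ \beta_m \end{bmatrix}
=
\begin{bmatrix}
c_i^{pls:b}\,(p_{t/i} - p_{a/i}) \\
c_i^{pls:a}\,(p_{t/i} - p_{a/i}) \\
c_i^{pls:e}\,(p_{t/i} - p_{a/i}) \\
c_i^{pls:b}\,(v_{t/i} - v_{a/i}) \\
\alpha_b + \alpha_c + \alpha_{gm} \\
\phi_c + \phi_{gm} \\
\beta_{gm}
\end{bmatrix}
\qquad (4.4)
$$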

and: 

$p_{t/a\,B,A,E}$ = the position components of the target relative to the sensor (attacker)
along the Boresight, Azimuth, and Elevation axes in the sensor frame, respectively

$v_{t/a\,B}$ = the velocity component of the target relative to the sensor (attacker) along
the boresight axis in the sensor frame, as provided by a Doppler radar

$\alpha_m$ = the aspect angle sensor-derived pseudo-measurement of angle of attack, found
by comparing the Euler angles of the target body frame relative to the filter inertial
frame with the filter-estimated target velocity vector under the coordinated turn flight
assumptions of Sect. 5.5.3

$\phi_m$ = the aspect angle sensor-derived pseudo-measurement of roll angle, found as for
$\alpha_m$

$\beta_m$ = the aspect angle sensor-derived pseudo-measurement of sideslip angle, found
as for $\alpha_m$

$c_i^{pls:b,a,e}$ = first, second, and third rows (row vectors - hence use of lower case “c”)
of the direction cosine matrix $C_i^{pls}$, which takes vectors from the filter inertial frame into
the sensor (predicted line-of-sight) frame, consisting of boresight, azimuth, and elevation
axes, as noted above

$p_{a/i}$ = position of the sensor (attacker) in the filter inertial frame, i.e., with
components taken along the north, east, or down axes of an earth-surface inertial or navigation
frame. Sensor position is assumed to be “perfectly” known (i.e., errors in target position
are orders of magnitude greater than errors in ownship position)

$v_{a/i}$ = velocity of the sensor (attacker) relative to and coordinatized in the filter
inertial frame, i.e., with components taken along the north, east, or down axes of an
earth-surface inertial or navigation frame. As for sensor position, sensor velocity is assumed to
be “perfectly” known

and  all  other  quantities  are  defined  above. 

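The matrix of partial derivatives of $h[x(t_i)]$ with respect to the states, $H$, is therefore
given by:

$$
H =
\begin{bmatrix}
[C_i^{pls}]_{3\times 3} & [0]_{3\times 10} \\
[0]_{1\times 3} \;\; C^{pls}_{1,1} \;\; C^{pls}_{1,2} \;\; C^{pls}_{1,3} & [0]_{1\times 7} \\
[0]_{3\times 6} &
\begin{matrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0
\end{matrix}
\end{bmatrix}_{7\times 13}
\qquad (4.5)
$$

where the numerical subscripts on $C^{pls}$ imply use of the entire matrix (3 x 3) or a scalar
element from the corresponding row and column (e.g., 1,2).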
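
Because the direction cosine matrix is held constant at each measurement event, the measurement function of Eqn. (4.4) is linear in the filter states, and the Jacobian of Eqn. (4.5) can be checked numerically. The sketch below assumes the state ordering position, velocity, then the angular states in the order they appear in the pseudo-measurement rows, followed by the lift scaling factor; that ordering and the function names are assumptions for illustration.

```python
import numpy as np

def measurement_function(x, C_pls_i, p_a, v_a):
    """Seven-element h[x]: sensor-frame position, boresight range rate, three angles.

    Assumed ordering: x[0:3] position, x[3:6] velocity, x[6] alpha_b, x[7] alpha_c,
    x[8] alpha_gm, x[9] phi_c, x[10] phi_gm, x[11] beta_gm, x[12] lift scale factor.
    """
    x = np.asarray(x, dtype=float)
    C = np.asarray(C_pls_i, dtype=float)
    rel_pos = C @ (x[0:3] - p_a)             # boresight, azimuth, elevation components
    range_rate = C[0] @ (x[3:6] - v_a)       # first row of the DCM is the boresight axis
    angles = [x[6] + x[7] + x[8], x[9] + x[10], x[11]]
    return np.concatenate([rel_pos, [range_rate], angles])

def measurement_jacobian(C_pls_i):
    """7 x 13 matrix of partials of h with respect to x, DCM treated as constant."""
    C = np.asarray(C_pls_i, dtype=float)
    H = np.zeros((7, 13))
    H[0:3, 0:3] = C          # position rows
    H[3, 3:6] = C[0]         # range-rate row depends on velocity only
    H[4, 6:9] = 1.0          # angle of attack pseudo-measurement
    H[5, 9:11] = 1.0         # roll angle pseudo-measurement
    H[6, 11] = 1.0           # sideslip pseudo-measurement
    return H
```

The lift scaling factor has no direct measurement; its column of the Jacobian is zero, so it is observed only through the dynamics.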

Small  errors  in  the  knowledge  of  gravity  (much  greater  than  those  typical  for  modern 
navigation  systems)  should  not  in  general  invalidate  this  approach.  Consider  a  five  degree 
error  in  the  direction  of  gravity  (equivalent  to  a  vertical  bias  of  less  than  0.004  g,  and  a 
horizontal  bias  of  less  than  0.09  g)  or  comparable  errors  in  knowledge  of  gravity  magnitude. 
With  reference  to  the  scenario  given  in  Sect.  4.2.1,  we  would  expect  this  error  magnitude 
to have negligible effects on scenarios that involve pose estimates implying significant differences
in acceleration magnitude (angle of attack), since small angle of attack differences
for  tactical  aircraft  quickly  lead  to  acceleration  differences  much  larger  than  one  g. 
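
The bias figures quoted for a five degree gravity direction error follow from resolving a unit gravity vector along and across the assumed vertical. A minimal check, with a hypothetical function name:

```python
import math

def gravity_bias_g(direction_error_deg):
    """Vertical and horizontal acceleration bias (in g) from a gravity direction error."""
    err = math.radians(direction_error_deg)
    vertical_bias = 1.0 - math.cos(err)    # shortfall along the assumed vertical
    horizontal_bias = math.sin(err)        # spurious component across it
    return vertical_bias, horizontal_bias

vertical, horizontal = gravity_bias_g(5.0)
print(f"vertical bias {vertical:.4f} g, horizontal bias {horizontal:.4f} g")
```

Both values fall under the bounds stated in the text (0.004 g and 0.09 g respectively).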


Where  pose  estimates  lead  to  differences  in  acceleration  orientation  only,  the  effect  of 
gravity  error  could  be  more  significant.  For  example,  where  the  error  in  gravity  direction  is 
comparable to the difference in lift orientation between correct and incorrect pose estimates,
and the incorrect pose estimate (target) appears to be more properly oriented with respect
to  the  (incorrect)  direction  of  gravity  than  the  correct  pose  estimate  (target),  this  situation 
could  lead  to  incorrect  identifications  -  the  incorrect  target  would  “fly”  with  lower  residual 
errors  than  the  correct  one.  In  any  case,  with  an  imaging  sensor  and  knowledge  of  range 
as  in  the  scenario  of  Sect.  4.2.1,  differences  in  physical  dimensions  for  typical  targets  of 
interest  should  lead  to  pose  estimate  differences  in  general  much  larger  than  typical  errors 
in  the  knowledge  of  the  direction  of  gravity. 

4.2.4 Filter Tuning. No great deal of filter tuning was required or performed
to demonstrate the behavior of interest in the proposed application. Dynamics driving
(pseudo) noise ($Q$), Gauss-Markov state correlation times ($\tau$), and measurement noise
parameters ($R$) are shown in Table 4.1.

Most initial variances are left large (relative to the square of typical values) for
states assumed “constant but unknown” to facilitate the filter converging to reasonably
good values. Note that the variance for $\alpha_b$ (the “trim” component of $\alpha$) is not large,
implying that it is reasonably well known. This is significant, as we shall see later.

Correlation times for Gauss-Markov processes were set based on reasonable
expectations for the time required for flight control systems to null fluctuations in these states.
Dynamics driving noise strength values for these Gauss-Markov angular states were then
set using the procedures discussed in Sect. 2.3.2.1 and [153:178] (as for the Gauss-Markov
acceleration states in the Singer target model), with reasonable assumptions for expected
angular standard deviations. Dynamics driving noise values for the assumed-constant
angular states are set to arbitrary small values.
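
For a first-order Gauss-Markov state, the standard relation between the expected standard deviation, the correlation time, and the driving noise strength can be sketched as below. This is the textbook relation, offered as an assumption about what the cited procedure amounts to; the numerical values are hypothetical and are not those of Table 4.1.

```python
import math

def gauss_markov_noise_strength(sigma, tau):
    """White noise strength q for x_dot = -x/tau + w with steady-state std dev sigma."""
    return 2.0 * sigma ** 2 / tau

def discretize_gauss_markov(sigma, tau, dt):
    """State transition factor and discrete driving noise variance over one step dt."""
    phi = math.exp(-dt / tau)
    q_d = sigma ** 2 * (1.0 - phi ** 2)
    return phi, q_d

# Hypothetical values: 0.05 rad perturbation std dev, 0.5 s correlation time, 0.1 s step.
sigma, tau, dt = 0.05, 0.5, 0.1
q = gauss_markov_noise_strength(sigma, tau)
phi, q_d = discretize_gauss_markov(sigma, tau, dt)

# Propagating the variance confirms that it settles at sigma squared.
variance = 0.0
for _ in range(500):
    variance = phi ** 2 * variance + q_d
```

A shorter correlation time with the same standard deviation requires a proportionally larger noise strength, which is the trade made when setting these parameters per target class.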

Measurement  corruptions  in  this  scenario  are  assumed  to  be  white.  The  measurement 
sampling  interval  is  taken  as  0.1  seconds,  consistent  with  dedicated  tracking,  as  opposed 
to  track-while-scan  operation.  The  following  paragraphs  consider  position,  velocity,  and 
aspect  angle  measurements  in  turn. 

Position measurements treat the target as a point, adding independently generated
Gaussian noise samples of standard deviation 100 feet to each axis in the true line-of-sight
frame. The filter measurement model assumes Gaussian noise of 100 foot standard
deviation relative to the predicted line-of-sight frame, as defined above. For spherical error
ellipsoids as employed here, the difference between orientations of the true and predicted
line-of-sight frames is not important.

[Table 4.1. Filter Tuning Parameters]

At  100,000  feet  range,  this  position  accuracy  equates  to  an  angular  accuracy  standard 
deviation  of  one  milliradian,  which  is  considered  to  be  state  of  the  art  for  an  aircraft  radar 
system  in  a  dedicated  tracking  mode,  but  very  conservative  with  respect  to  the  performance 
of  state-of-the-art  military  land-  or  ship-based  systems  [15,  16,  11].  Likewise,  assumption 
of  100  foot  standard  deviation  for  range  measurements  is  considered  conservative.  It  is 
important  to  note  that  position  bias  errors  (i.e.,  errors  constant  for  periods  of  several 
seconds  or  more)  will  have  little  effect  on  the  proposed  algorithms.  “White”  and  otherwise 
rapidly  changing  errors,  however,  are  observed  by  the  filter  as  target  accelerations,  and  for 
aircraft  targets  at  least,  imply  large  variations  in  aspect  angle. 

Doppler  velocity  measurements  are  similarly  defined  from  the  target-sensor  relative 
velocity  along  the  true  boresight,  but  are  treated  by  the  filter  as  though  along  the  predicted 
boresight. Both true and filter doppler velocity standard deviations are 60 feet per second
in Test Scenarios 1A and 1B (defined in the next section). This accuracy is conservative
with respect to published results [48]. True and filter doppler velocity standard deviations
were increased to 100 feet per second (i.e., approximately twice the standard deviation of
the results in [48]) to reduce considerably, but not eliminate, the impact of doppler
information for Scenario 2 - this is the only tuning difference between the three scenarios.

Aspect  angle  measurements  are  corrupted  in  all  cases  with  white  noise  of  standard 
deviation  three  degrees,  while  the  filter  assumes  two  degrees.  This  was  an  arbitrary  choice, 
rather  smaller  than  the  five  degree  standard  deviations  used  as  nominals  by  [120],  but 
reflecting  an  ultimate  intent  to  obtain  (through  new  methods  discussed  in  the  previous 
chapter,  and  demonstrated  in  the  following  chapter)  smoother  pose  estimates  than  have 
been  available  in  the  past.  The  increase  in  true  over  filter-modelled  noise  is  simply  intended 
to  stress  the  filter  somewhat.  Bias  error  values  are  discussed  in  the  following  section,  since 
they were scenario-specific.

4.3 Results and Discussion

The proposed filter was implemented using the “Multimode Simulation for Optimal
Filter Evaluation” (MSOFE) software [46], as developed by Mr. Stan Musick of Wright
Laboratory (WL/AAAS) et al. The evaluation was conducted primarily by feeding the
filter with artificially biased and white noise-corrupted aspect angle estimates. These
corruptions are assumed to have arisen, for example, in the manner described in Sect. 4.2.1 -
this demonstration does not provide for an explicit simulation of the pose estimation
process. All results are for 20-run Monte Carlo sets with statistics as given in the previous
section.

These results will show decisively that operating a kinematic/aspect-angle filter with
inappropriate model assumptions can lead quickly to large residuals or deviations from
expected state values. The comparison of these residuals and states of interest with allowable
covariance values using Bayes’ Rule and multiple model algorithms as described in
Sect. 2.3.1.3 and [154:129-136] is a powerful approach for indicating the correct parameter
set, or target class, that generated the observed combination of kinematic and feature
observable behavior. It will be clear, moreover, that in many cases, a simple thresholding
algorithm will quickly eliminate some target classes from consideration. Another point to
note is that certain scalar residual elements and states may be more indicative of incorrect
target class associations than others - observing behavior in these values individually
may be more practical than observing, say, the ensemble log likelihood of all residuals and
states of interest, where differences can become “blurred” in the fluctuations common to
all models.

Most of the plots showing the natural log of the residual/state likelihood for the
various scenarios are computed according to Eqn. (4.6). Note that this likelihood is not a
moving window sum or a sum over an ever-increasing time period, but rather the individual
log likelihood value at the particular time. The intent here is to show that, in the mean,
significant residual differences can be developed at any one time. Two sets of plots will
show how likelihood separation increases with the use of a moving window sum, over one
second (ten measurement events, in those cases).

Also, note that Eqn. (4.6) treats the residual vector elements as independent from
each other at any given event, which is not strictly conventional - generally, the log likelihood
is calculated using the weighted inner product of the residual vector with itself at
time $t_i^-$ (prior to any updates), where the weighting matrix is the inverse of the (classical)
residual covariance matrix $A(t_i) = [H(t_i) P(t_i^-) H^T(t_i) + R(t_i)]$ (see Sect. 2.3.1.3).
However, MSOFE performs sequential scalar measurement updates, and the residual values
and corresponding variances are made available sequentially as scalars. Although not
formally proven for this effort, it is evident (as implied by Kailath [118]) that residual
covariance-weighted distance in the “innovations space” is an information measure, and,
under linear filtering assumptions, we should not expect it to change with method or order
of processing. For the case of two-dimensional measurements, equivalence is easily shown
between the quadratically-weighted squared distance obtained from vector residuals as
noted above, and the squared distance obtained from sequential scalar updates using the
following equation. Thus, the likelihood of interest is taken as:


$$L_1(t_i) = -0.5 \left\{ \left[ \sum_{m} \frac{r_m^2(t_i)}{\sigma_m^2(t_i)} \right] + \frac{\left[ 1 - \hat{k}_{aero}(t_i^+) \right]^2}{P_{k_{aero}}(t_i^+)} \right\} \qquad (4.6)$$

where:

$L_1(t_i)$ = the (natural) log likelihood of the observed residuals and state for one (hence
subscript “1”) measurement event at time $t_i$, less the bias factor associated with the log
likelihood of a Gaussian probability density.

$r_m^2(t_i)$ = the residual (squared) for the m-th measurement at time $t_i$.

$\sigma_m^2(t_i)$ = the filter-computed variance for the residual of the m-th measurement at
time $t_i$.

$P_{k_{aero}}(t_i^+)$ = the filter-computed estimate of the variance in the aerodynamic scale
factor $k_{aero}$ (the 13th state) following measurement update at time $t_i$.

and other variables are as defined above.
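
The per-event likelihood of Eqn. (4.6) can be sketched in a few lines, assuming it has the form described in the surrounding text: the sum of squared scalar residuals, each divided by its filter-computed variance, plus the squared deviation of the $k_{aero}$ estimate from its nominal value of one divided by its filter-computed variance, all scaled by -0.5. The function name, argument names, and sample numbers below are hypothetical and are not taken from MSOFE.

```python
from typing import Sequence


def event_log_likelihood(
    residuals: Sequence[float],
    residual_variances: Sequence[float],
    k_aero_estimate: float,
    k_aero_variance: float,
    k_aero_nominal: float = 1.0,
) -> float:
    """Log likelihood (less the Gaussian bias factor) for one measurement event.

    residuals          -- scalar residuals from the sequential updates
    residual_variances -- filter-computed variance of each scalar residual
    k_aero_estimate    -- post-update estimate of the aerodynamic scale factor
    k_aero_variance    -- filter-computed variance of that estimate
    """
    if len(residuals) != len(residual_variances):
        raise ValueError("one variance is required per residual")
    residual_term = sum(r * r / v for r, v in zip(residuals, residual_variances))
    state_term = (k_aero_nominal - k_aero_estimate) ** 2 / k_aero_variance
    return -0.5 * (residual_term + state_term)


# Hypothetical numbers: two residuals at one sigma each, scale factor on nominal.
matched = event_log_likelihood([1.0, 2.0], [1.0, 4.0], 1.0, 0.25)
# Same residuals, but the scale factor estimate has drifted to 2.0.
drifted = event_log_likelihood([1.0, 2.0], [1.0, 4.0], 2.0, 0.25)
```

In this sketch a drift in the scale factor lowers the likelihood even when the residuals themselves are unremarkable, which is the reason for including the state term alongside the residual terms.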

We note in passing that for nonlinear filtering, order of computation can be very
important - in general, we often choose to process the most accurate information first,
in particular to provide the most accurate start for successive relinearizations. In the
sequence of scalar updates for this effort, we process position measurements before the
doppler measurement - to provide a best estimate of target direction along which to align
and process the doppler velocity information.
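
Returning to the linear case, the equivalence noted earlier between the vector (quadratically weighted) squared distance and the sum of scalar squared distances from sequential updates can be checked numerically. The following is a minimal sketch for a two-state, two-measurement linear problem with uncorrelated measurement noise; all names and numbers are hypothetical, and the state covariance is assumed symmetric.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def vector_distance(x: Vector, P: Matrix, H: Matrix, r_diag: Vector, z: Vector) -> float:
    """r^T A^-1 r with A = H P H^T + R, for exactly two measurements."""
    n = len(x)
    r = [z[i] - sum(H[i][j] * x[j] for j in range(n)) for i in range(2)]
    # P H^T (n x 2), then A = H (P H^T) + diag(r_diag) (2 x 2).
    pht = [[sum(P[i][k] * H[j][k] for k in range(n)) for j in range(2)] for i in range(n)]
    A = [[sum(H[i][k] * pht[k][j] for k in range(n)) + (r_diag[i] if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Closed-form quadratic form using the 2 x 2 inverse.
    return (A[1][1] * r[0] ** 2 - (A[0][1] + A[1][0]) * r[0] * r[1]
            + A[0][0] * r[1] ** 2) / det


def sequential_distance(x: Vector, P: Matrix, H: Matrix, r_diag: Vector, z: Vector) -> float:
    """Sum of (scalar residual)^2 / (scalar residual variance) over sequential updates."""
    x = list(x)
    P = [row[:] for row in P]
    n = len(x)
    total = 0.0
    for m in range(len(z)):
        h = H[m]
        ph = [sum(P[i][k] * h[k] for k in range(n)) for i in range(n)]  # P h^T
        s = sum(h[i] * ph[i] for i in range(n)) + r_diag[m]             # residual variance
        res = z[m] - sum(h[j] * x[j] for j in range(n))                 # scalar residual
        total += res * res / s
        gain = [ph[i] / s for i in range(n)]
        x = [x[i] + gain[i] * res for i in range(n)]
        P = [[P[i][j] - gain[i] * ph[j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical example values.
x0 = [0.0, 0.0]
P0 = [[4.0, 1.0], [1.0, 2.0]]
H0 = [[1.0, 0.0], [1.0, 1.0]]
R0 = [1.0, 2.0]
z0 = [1.0, 3.0]

d_vector = vector_distance(x0, P0, H0, R0, z0)
d_sequential = sequential_distance(x0, P0, H0, R0, z0)
```

For these values both distances evaluate to 1.0. The agreement relies on the measurement noise being uncorrelated between the scalar measurements and on the measurement model being linear.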

4.3.1 Scenarios 1A and 1B. The following discussion refers to a situation very
similar to that shown in Fig. 4.1 - a target is turning toward us (the sensor) in a horizontal
plane so that the plane of the target wings is generally normal to the sensor-to-target vector.
This is a scenario of great interest in air-to-air combat, due to the possibility that a hostile
target will release a missile at the completion of the turn. We wish to determine the class
of the unknown target engaged in this maneuver.

Each of the points of this discussion revolves around a pair of figures - the first figure
showing what a properly-matched target model would yield, and the second figure showing
what a particular mismatched target model would yield. The effect of the mismatched
model in this case is to induce: (1) a 20 degree positive bias in the roll angle pseudo-measurement
(i.e., the target is apparently rolled 20 degrees further to the right than the
actual roll angle required for the given maneuver) and (2) a 20 degree positive bias in the
angle of attack pseudo-measurement. The reader should note the extent to which, in each
case, the mean measurement residual or state estimate remains within the bounds of the
residual or state filter-computed standard deviation, $\sigma_f$ - the symmetrically paired curves
represent zero (the filter-anticipated residual mean) +/- one $\sigma_f$.

Figs. 4.3 through 4.10 refer to Scenario 1A, in which the target turn rate equates
to 1 g in the horizontal plane (so that the nominal roll angle is 45 degrees to the local
horizontal). For Figs. 4.11 through 4.14, or Scenario 1B, we increase the horizontal turn
acceleration to 4 g’s, so that the nominal orientation of the plane of the wings is nearly
perpendicular to the local horizontal.
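
The nominal roll angles quoted above follow from the geometry of a level, coordinated turn, in which the vertical component of lift balances weight and the horizontal component supplies the turn acceleration, so that the tangent of the roll angle equals the horizontal acceleration in g's. A short sketch under that assumption (the function names are illustrative only):

```python
import math


def coordinated_turn_roll_deg(horizontal_g: float) -> float:
    """Roll angle (degrees from the local horizontal) for a level, coordinated
    turn with the given horizontal acceleration expressed in g's."""
    return math.degrees(math.atan(horizontal_g))


def load_factor(horizontal_g: float) -> float:
    """Total lift divided by weight for the same turn."""
    return math.hypot(1.0, horizontal_g)


roll_1a = coordinated_turn_roll_deg(1.0)  # Scenario 1A: 1 g horizontal
roll_1b = coordinated_turn_roll_deg(4.0)  # Scenario 1B: 4 g horizontal
```

Under these assumptions the 1 g turn gives 45 degrees and the 4 g turn gives roughly 76 degrees, consistent with the wings being nearly perpendicular to the local horizontal in Scenario 1B.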

Figs.  4.3  and  4.4  show  the  behavior  of  the  vertical  position  measurement  (pt/0j) 
residuals  with  correct  and  incorrect  model  assumptions.  The  second  (incorrect  model) 
figure  exhibits  classical  residual  “failure”,  which  may  be  interpreted  as  the  filter  directing 
its acceleration estimate in the wrong direction, and being unable to supply the lift required
to maintain the filter model near the altitude of the true target - hence, these altitudes
diverge, and the divergence appears immediately in the residual. This behavior will be
observed  with  or  without  the  angle  of  attack  bias.  Note  that  the  plots  look  essentially 
the  same  over  the  first  two  seconds  because  some  time  is  required  for  the  underlying 
acceleration  contradiction  to  develop  an  obvious  position  contradiction  (moreover,  the  same 
noise  realizations  were  used  for  both  plots). 

Figs. 4.5 and 4.6 show the behavior of the state estimate for $k_{aero}$. As noted above,
this state is expected to have a value of one. Errors are reasonably small for the unbiased
angular pseudo-measurements - the “wandering” effect over time is due to limited
observability between this and other states (principally the angle of attack states). For
biased angle of attack measurements, however, $k_{aero}$ reflects immediate deviation from its
nominal value. This result may be interpreted as the filter telling the user that it is willing
to accept the observed position, velocity, and aspect angle values, but that the target must
have many times the nominal expected mass, in order to be consistent with the observed
position and velocity measurements over time. This behavior will be observed with or
without the roll angle bias.

Note that the behavior of the $k_{aero}$ estimate for the wrong model is due to the fact
that the initial variance for state $\alpha_b$, the “trim” angle of attack value, is kept small and the
initial $\alpha_b$ value is zero. Increasing the initial variance for this state will allow it to assume
significantly nonzero values, reducing the tendency for the filter to modify $k_{aero}$ according
to observed measurements. In that event, $\alpha_b$ should be added to Eqn. (4.6) as a state of
interest - looking for deviations from a nominal trim angle of attack. Alternatively, “locking”
$k_{aero}$ to a value near one and $\alpha_b$ to values near zero will cause immediate deviations in
the range rate residual, if the boresight vector is aligned along axes of motion affected by
these assumptions. The key point is that, one way or another, where kinematics and aspect
angle are as highly coupled as they are for most aircraft, incorrect modelling assumptions
will betray themselves in a kinematic/aspect-angle filter.

Next, Figs. 4.7 and 4.8 contrast the behavior of the log residual/state likelihood at
each measurement event (not a sum over several events) for the two target model choices.
The somewhat lower likelihoods around the one second point in time are due to the angle
of attack error, and the recovery occurs because of the changed or “learned” value in $k_{aero}$.
The unreasonable resulting value of $k_{aero}$, however, is the major factor in the remaining
difference between residual/state log likelihood for the correct and incorrect target models.

Figure 4.3. Z (Vertical) Sensor Axis Position Residuals, Correct Model, Scenario 1A (horizontal axis: seconds)

Figure 4.4. Z (Vertical) Sensor Axis Position Residuals, Wrong Model, Scenario 1A (horizontal axis: seconds)

Figure 4.5. $k_{aero}$ Error, Correct Model, Scenario 1A (vertical axis: non-dimensional)

Figure 4.6. $k_{aero}$ Error, Wrong Model, Scenario 1A (vertical axis: non-dimensional)

Figure 4.7. Residual / State Log Likelihoods, Correct Model, Scenario 1A (vertical axis: natural log likelihood, less bias factor)

Figure 4.8. Residual / State Log Likelihoods, Wrong Model, Scenario 1A (vertical axis: natural log likelihood, less bias factor)

The reader should expect that using partial or growing sums of log likelihoods over
time will increase the distinction between likelihoods for the two cases. The effect of
summing the likelihoods over a moving window of one second (ten measurements) is illustrated
in Figs. 4.9 and 4.10. Note the improved separation between correct and incorrect
models for this pair of figures, by comparison with Figs. 4.7 and 4.8. Treating these likelihoods
in a two-class Bayesian parameter estimation case (Eqn. (3.2)) with equal a priori
probability $p(\omega_j)$, we quickly observe that since the mean likelihood for the wrong case
($\approx 10^{-10}\exp(-45)$) is many orders of magnitude smaller than that for the correct case
($\approx 10^{-10}\exp(-30)$), the a posteriori probability for the “correct” (unknown a priori) target,
in the mean, is effectively one (note that the leading coefficients for the two Gaussian
likelihoods are both on the order of $1.0 \times 10^{-10}$).
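
The two steps described above, a moving window sum of per-event log likelihoods followed by a Bayesian comparison between candidate classes, can be sketched as follows. This is a minimal illustration rather than the implementation used for the figures: the function names are hypothetical, the priors default to equal values, and the exponents of -30 and -45 are the approximate mean values quoted in the text. Any leading coefficient common to both classes cancels in the normalization.

```python
import math
from collections import deque
from typing import Dict, Iterable, List, Optional


def moving_window_sum(values: Iterable[float], window: int = 10) -> List[float]:
    """Sum of the most recent `window` values at each step
    (fewer values are summed until the window fills)."""
    recent: deque = deque(maxlen=window)
    sums = []
    for value in values:
        recent.append(value)
        sums.append(sum(recent))
    return sums


def class_posteriors(
    log_likelihoods: Dict[str, float],
    priors: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """A posteriori class probabilities from log likelihoods via Bayes' rule."""
    names = list(log_likelihoods)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    # Subtract the largest log likelihood before exponentiating to avoid underflow.
    peak = max(log_likelihoods.values())
    weights = {n: priors[n] * math.exp(log_likelihoods[n] - peak) for n in names}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Approximate mean windowed log likelihoods quoted in the text.
posteriors = class_posteriors({"correct": -30.0, "wrong": -45.0})
```

With these inputs the probability assigned to the wrong class is on the order of exp(-15), or about 3 parts in 10 million, so the posterior for the correct class is effectively one.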

Now we consider Scenario 1B. Figs. 4.11 and 4.12 show that higher turn rates speed
the development of residual error in the vertical sensor position channel. Figs. 4.13 and 4.14
show that the $k_{aero}$ state behavior is consistent with the earlier case (as are the residual
log likelihoods, not shown here). Note that $k_{aero}$ state error appears somewhat smaller in
Scenario 1B than Scenario 1A because of, and not in spite of, the higher true acceleration
level in this case. The proportional difference in mass implied by an increase (error) of 20
degrees in angle of attack when the required angle of attack is large (for the high-g turn)
is less than the proportional difference in mass implied by an angle of attack error of 20
degrees when the required angle of attack (turn rate) is smaller.
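
The proportionality argument above can be illustrated with a deliberately simplified model. The sketch assumes that lift is proportional to angle of attack, so that a filter fed an angle of attack that is too large by a fixed bias must scale the implied mass by (required + bias) / required to remain consistent with the observed acceleration. The linear lift assumption, the function names, and the value of 3 degrees of angle of attack per g are all hypothetical and are not taken from the filter model.

```python
import math


def implied_mass_ratio(alpha_required_deg: float, alpha_bias_deg: float = 20.0) -> float:
    """Apparent mass scaling needed to reconcile a biased angle of attack
    with the observed acceleration, assuming lift proportional to angle of attack."""
    return (alpha_required_deg + alpha_bias_deg) / alpha_required_deg


def load_factor(horizontal_g: float) -> float:
    """Total lift over weight in a level turn with the given horizontal g."""
    return math.hypot(1.0, horizontal_g)


ALPHA_PER_G_DEG = 3.0  # hypothetical angle of attack needed per g of lift

alpha_1a = ALPHA_PER_G_DEG * load_factor(1.0)  # required angle of attack, 1 g turn
alpha_1b = ALPHA_PER_G_DEG * load_factor(4.0)  # required angle of attack, 4 g turn

ratio_1a = implied_mass_ratio(alpha_1a)
ratio_1b = implied_mass_ratio(alpha_1b)
```

Under these assumptions the implied mass ratio is between 5 and 6 for the 1 g turn and between 2 and 3 for the 4 g turn, which matches the qualitative trend described: the same 20 degree bias is proportionally less significant when the required angle of attack is already large.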

It  is  important  to  note  that  the  tuning  parameters  in  these  scenarios  represent  what 
are  for  many  applications  very  worst  case  choices,  both  for  the  filter  and  truth  model.  For 
example,  in  an  air-to-air  scenario  with  a  radar  and  an  imaging  sensor,  even  with  many 
miles  of  range  to  target,  the  random  component  of  position  measurement  error  normal 
to  the  line-of-sight  could  be  considerably  better  than  the  100  foot  one  sigma  errors  used 
here,  if  the  imaging  sensor  aids  in  the  pointing  and  tracking  process.  This  will  reduce  the 
random noise in these results considerably, while preserving the trends shown here. Also
recall that these results are for a case in which the aspect angle pseudo-measurements are
rather noisier than assumed by the filter - this means that the filter tends not to suppress
the noise as well as it could if it were “told” that the noise level is higher. Thus, tuning
could improve these results somewhat, but the intent here is to show that precise tuning
is not required to demonstrate major differences between classes in likely scenarios.

Figure 4.9. Residual / State Log Likelihoods, Correct Model, Scenario 1A, Moving Window Sum (one second, ten measurements)

Figure 4.10. Residual / State Log Likelihoods, Wrong Model, Scenario 1A, Moving Window Sum (one second, ten measurements)

Figure 4.11. Z (Vertical) Sensor Axis Position Residuals, Correct Model, Scenario 1B (horizontal axis: seconds)

Figure 4.12. Z (Vertical) Sensor Axis Position Residuals, Wrong Model, Scenario 1B (horizontal axis: seconds)

Figure 4.13. $k_{aero}$ Error, Correct Model, Scenario 1B

Figure 4.14. $k_{aero}$ Error, Wrong Model, Scenario 1B

4.3.2 Scenario 2. This scenario is very simple - the target is 20 miles out,
10,000 feet higher than the sensor, and flying at 800 feet per second straight and level on
a bearing directly toward the sensor. This scenario is unique among the ones addressed in
this research in that the aircraft isn’t turning - in this case, the combination of conditions
required merely for level flight will quickly indicate an incorrect target class. Recall from
Sect. 4.2.4 that the doppler velocity information quality is degraded here over that in the
previous case, in particular to show that high quality doppler isn’t required - the activity
of interest here (conflict in the vertical acceleration estimate) occurs normal to the line-of-sight.

As before, we have two candidate target classes - the correct one, and an incorrect
one. Again, the pose estimator gives an aspect angle measurement which equates to a
20 degree positive bias error in angle of attack. This could occur for this trajectory, for
example, if a high range resolution radar algorithm were “allowed” to find the best pose
match from a short true target to a long library model - generally looking beyond the 10
or 20-degree square extent of aspect angle window to which searches would normally be
restricted (bearing in mind, as noted in Sect. 3.8.4, that small aspect angle search windows
based on real time tracking for aircraft targets are highly unrealistic in any case). Figs. 4.15
and 4.16 show that the incorrect model association betrays itself quickly in the $k_{aero}$ state
as before. In general, it has been noted that any angle of attack bias with an absolute
value of ten or more degrees causes comparable results for aircraft of the F-4 / MIG-21
performance range.

Figs. 4.17 and 4.18 (likelihoods at each measurement) and Figs. 4.19 and 4.20 (summing
likelihoods over a moving window of one second or ten measurements) show that
residual differences are somewhat more pronounced between correct and incorrect models
than in the previous scenarios. This is due both to the poorer doppler quality and different
trajectory here, compared with that in the previous scenarios. Restoring the better
doppler quality will improve these residuals somewhat, but artificially providing the filter
with “vertical” velocity measurements (i.e., target velocity measurements along the axis of
the vertical motion which would tend to be induced by a large angle of attack in this case)
is observed to restore the residual error to slightly smaller than that observed in Fig. 4.8 -
as expected, since Scenario 2 does not have a roll angle measurement bias. The point here
is that high quality measurements in a particular state space dimension can retard the
development of high error residuals in that dimension, even when the model parameters
governing state behavior in that dimension are in error.

Figure 4.15. $k_{aero}$ Error, Correct Model, Scenario 2

Figure 4.16. $k_{aero}$ Error, Wrong Model, Scenario 2 (horizontal axis: seconds)

Figure 4.17. Residual / State Log Likelihoods, Correct Model, Scenario 2

Figure 4.18. Residual / State Log Likelihoods, Wrong Model, Scenario 2

Figure 4.19. Residual / State Log Likelihoods, Correct Model, Scenario 2, Moving Window Sum (one second, ten measurements)

Figure 4.20. Residual / State Log Likelihoods, Wrong Model, Scenario 2, Moving Window Sum (one second, ten measurements)

In other words, high quality measurements can force the state dynamics model to
follow observed behavior, despite the model’s inclinations to the contrary. In that case,
residual behavior per se may not be the best indicator of an improper parameter set choice -
hence our interest in considering indicators such as the behavior of states like $k_{aero}$.

Summarizing, the key point in this scenario is that the translational/rotational coupling
for conventional airplanes is so severe that relatively small pose estimate differences
can reveal incorrect target class choices dramatically even in straight, level flight. The
conventional approach to target recognition by comparing distances in feature observable
spaces for various target class choices, without considering dynamic implications, is not
nearly as likely to yield an output that so forcefully indicates incorrect choices. This is
simply further confirmation of the imperative to consider the joint likelihood of all observable
events in making object recognition decisions.

4.4 Summary

The results of this chapter provide clear confirmation of claims made in Chapter III:
that kinematic/aspect filters driven by correct and incorrect pose estimates under nonrestrictive
conditions can exhibit distinctly different residual and state likelihoods. These
likelihoods can be used in a Bayesian multiple model residual analysis parameter estimator
for target recognition - without explicitly comparing observed and predicted feature
observable measurements for correct and incorrect classes.

This application is only one in a class of techniques that can be used for dynamic object
recognition whenever (1) two or more disjoint subsets of estimator state variables are
highly coupled by parameters that differ between object classes, (2) measurements or
pseudo-measurements are available that are functions of state variables from the disjoint
subsets, and (3) the objects are observed under conditions that exploit state variable coupling.
These techniques will consider the joint likelihood of measured and pseudo-measured
events, conditioned on (1) state coupling rules associated with each candidate parameter
set, and (2) prior measurements and pseudo-measurements - including, implicitly, the
feature observable measurements from which pseudo-measurements were derived.

These results confirm a significant theoretical and practical contribution to the field
of dynamic object recognition, extending the proposals of Therrien [211] to cases in which
predictor/corrector methods are not suitable for all measurement quantities, but in which
low ambiguity mappings exist between such ill-defined measurement quantities and one or
more uniquely coupled estimator states. The results also demonstrate in particular a new
and powerful approach for aircraft recognition.

In the following chapter, we demonstrate methods for explicitly comparing feature
observable measurements to feature observable libraries for known classes, under conditions
where predictor/corrector methods may be infeasible. In particular, we will be concerned
with combinations of rapid, unpredictable state space transitions and nonlinear measurement
functions that combine to create “high frequency” measurement variations - a set of
conditions that applies in general for high range resolution (HRR) radar target recognition.

V. Dynamic Programming Sequence Comparison for Target Recognition

5.1 Introduction

This chapter will expand upon the discussion in Chapters I and III to demonstrate
the  proposed  concepts  of  dynamic  programming-based  sequence  comparison  -  “motion 
warping”  -  for  sensor  fusion  and  object  recognition.  This  is  the  second  of  three  steps  for 
considering  the  joint  likelihood  of  feature  observations  and  kinematic  measurements,  as 
discussed  in  Sect.  3.1.  The  objective  here  is  to  define  and  investigate  a  class  of  algorithms 
that  combines  feature  observable  information  while  obeying  feasible  or  observed  kinematic 
constraints  for  the  objects  of  interest.  We  will  find  that  restricting  feature  observable 
associations  to  kinematically  likely  sets  achieves  the  result  predicted  in  Sect.  2.6.2  -  that 
is,  reducing  the  output  likelihood  value  for  incorrect  associations,  without  substantially 
affecting  the  output  likelihood  values  for  correct  associations. 

As  in  the  previous  chapters,  the  primary  interest  here  is  recognition  of  aircraft  targets, 
and  the  terminology  will  reflect  that  interest.  Again,  however,  these  basic  approaches  are 
suitable  for  application  to  many  different  recognition  tasks. 

5.2  Concept  Overview 

The  process  of  motion  warping  as  demonstrated  in  this  chapter  is  briefly  summarized 
in  the  following  steps,  to  provide  a  basis  for  understanding  the  more  technical  discussion 
in  the  sections  that  follow. 

5.2.1 Tracking the Target. This process can use in general any of the recognized
mathematical models for target tracking. The research described in this chapter uses the
standard extended Kalman filter / Singer model kinematic tracking approach as described
in Sect. 2.3.2.1, followed by optimal fixed lag smoothing and polynomial curve fitting to
develop trajectory information suitable for estimating acceleration of an aircraft target in
a turn. Smoothing and polynomial curve fitting are covered in detail in App. C.6.

5.2.2 Defining Likely Aspect Angle Regions and Paths. Given an estimate for
the target position, velocity, and acceleration over some time period of interest, for most
target classes of interest we can develop estimates for the most likely, or nominal, aspect
angle path traced by the target-sensor vector on the hypothetical aspect angle sphere, as
discussed in Sect. 2.3.3.1. Knowledge of statistics for the error in target and ownship state
will allow us to define aspect angle error bounds around the nominal aspect angle path.

5.2.3 Matching Feature Observations to Aspect Angles. At this point, we have
defined aspect angle regions on target models of interest and nominal paths through those
regions, i.e., paths kinematically likely to have been traced as measured feature observable
sequences were generated. Target sensor signature modeling tools are used to provide
signature libraries corresponding to required models and aspect angles. From one of these
models, we can select a feature observable sequence which will be corrupted with noise to
produce the “true” sequence. Now we use any of several dynamic programming sequence
comparison techniques (as discussed in Sect. 2.4) or other methods to match feature
observables with likely aspect angle states for each feasible target model.
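
The matching step can be illustrated with a minimal sketch. It assumes a plain dynamic
time warping recursion with a squared-difference local cost, which is only one of the
sequence comparison techniques referred to above and is not necessarily the form used in
this research. The function name `dtw_cost` and the toy one-element feature vectors are
hypothetical; in practice each library sequence would hold the signatures along a
kinematically likely aspect angle path for one candidate target model.

```python
def dtw_cost(observed, library):
    """Minimum cumulative cost of aligning two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(observed), len(library)
    # cost[i][j] holds the best cost of aligning observed[:i] with library[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: squared Euclidean distance between feature vectors
            # (an assumed metric, chosen only for illustration).
            local = sum((a - b) ** 2
                        for a, b in zip(observed[i - 1], library[j - 1]))
            # Allow a match, a repeated observation, or a repeated library state.
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    return cost[n][m]


# Hypothetical observed sequence and two candidate model sequences.
observed = [[0.0], [1.0], [1.0], [2.0]]
model_a = [[0.0], [1.0], [2.0]]
model_b = [[2.0], [1.0], [0.0]]
cost_a = dtw_cost(observed, model_a)
cost_b = dtw_cost(observed, model_b)
```

In this toy case the observed sequence warps onto `model_a` at zero cost, while `model_b`
cannot be aligned without penalty, so the lower cost favors `model_a`.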

5.2.4 Evaluating Performance. We will see that the performance results of
dynamic programming sequence comparison can be expressed in terms of generalized
ambiguity functions, as well as in more conventional terms such as probability of correct or
incorrect recognition.

5.3  Generation  of  Simulated  Target  Signatures 

5.3.1 Introduction. As the author began to develop dynamic programming (DP)-based
sequence comparison for moving object recognition, Mr. Michael Bryant and Mr.
Rick  Mitchell  of  the  Target  Recognition  Technology  Branch  (WL/AARA)  at  the  USAF’s 
Wright  Laboratory  (USAF  sponsor  for  this  effort)  requested  that  the  technique  be  applied 
to  aircraft  recognition  using  high-range  resolution  (HRR)  radar.  This  choice  of  sensor 
signature domain was influenced primarily by the availability of computer models for
signature generation. The fact that dynamic time warping (DTW) for speech is notably
robust with regard to choices of feature space and metric was also reassuring.

Thus, HRR radar target signature libraries were a fundamental requirement for
pursuing this research. Achieving high fidelity with respect to real-world signatures from
actual  targets  of  interest,  however,  was  not  a  requirement.  Recall  that  the  intent  here 
is  not  to  establish  performance  absolutes,  but  simply  to  demonstrate  reduced  ambiguity 
in  object  recognition  using  sequence  comparison  techniques,  for  some  choice  of  signature 
and  metric,  relative  to  more  conventional  approaches.  That  being  said,  the  effort  would 
clearly  be  more  interesting  if  some  connection  is  made  between  the  research  targets  and 
real  targets,  and  research  metrics  and  real  metrics.  The  chosen  approach  to  radar  signature 
generation  reflects  this  compromise. 

Some  measured  signature  data  was  available  [166,  20],  but  the  author  made  only 
limited  use  of  this  data  for  several  reasons.  First,  most  measured  data  is  derived  from 
classified  research,  and,  even  where  the  data  itself  is  unclassified,  much  information  of 
interest  about  the  data  remains  classified  (particularly  associations  between  the  data  and 
the  actual  target  classes).  This  dissertation  is  required  to  be  unclassified. 

Second,  the  available  measured  data  is  limited  with  respect  to  aspect  angles  and 
polarizations (polarization, as discussed in Sect. 2.2.3, refers to the orientation of the
incident  radar  waveform  in  roll  angle  around  the  sensor-to-target  vector,  with  respect  to 
the  target).  Most  available  aircraft  signature  data  tends  to  be  taken  from  aspect  angles 
near  the  plane  of  the  target’s  wings.  For  our  purposes,  this  region  tends  to  be  of  little 
interest,  since  a  turning  aircraft  target  at  any  altitude  reasonably  near  that  of  the  sensor 
quickly  exhibits  aspect  angles  that  are  far  from  the  plane  of  the  wings.  Use  of  measured 
data, then, would have severely limited the possible target trajectories.

Third, measured data is available for only a few distinct target classes. For our
purposes, we will desire to define pseudo-targets, in some sense between real target classes
of  interest.  By  definition,  measured  data  will  not  provide  signatures  for  these  targets. 
Other  problems  with  measured  data  from  recent  tests  aside  from  those  mentioned  here  are 
discussed  at  length  in  [20]  (classified  secret). 

For  these  reasons,  the  author  and  the  research  sponsors  (WL/AARA)  determined 
that synthetic signature generation was the proper approach for this research. The
initial choice was the Syracuse Research Corporation radar signature simulation “SRCRCS”
developed under contract to the USAF’s Phillips Lab at Rome, N.Y. [53]. This tool was
familiar  both  to  AFIT  and  WL/AARA,  and  was  hosted  on  computer  systems  at  both 
locations. 

SRCRCS  has  two  significant  shortcomings  that  limited  its  utility,  however.  First, 
it is written to run on a VMS operating system, and without further effort would not
be compatible with the UNIX-based Sun SPARC workstations, where the bulk of this research
was to be accomplished. Second, it is deterministic - that is, the particular target model,
center frequency, bandwidth, aspect angle, and polarization value completely define the
signature. As we have seen, for real targets, even when these factors are constant, the
signature  will  vary  randomly  over  time,  due  to  body  vibrations  and  other  factors. 

The  first  of  these  problems  was  overcome  by  using  the  MIT/Lincoln  Laboratory 
RCSTooLLBox  radar  signature  simulation  tool  [37].  Other  tools,  some  of  which  have  much 
higher  fidelity  than  RCSTooLLBox,  were  judged  to  be  less  mature,  more  complex,  or  more 
computationally  demanding  than  the  objectives  and  timelines  for  this  research  required. 
The  second  problem  was  overcome  by  adding  noise  of  realistic  statistics  to  deterministic 
signatures.  The  following  sections  describe  these  actions  in  detail. 

5.3.2  Signature  Simulators:  RCSTooLLBox  and  Others.  RCSTooLLBox  [37] 
is  a  derivative  of  SRCRCS,  developed  by  Dr.  E.C.  Burt  (formerly  of  Syracuse  Research 
Corporation)  and  others  at  MIT  Lincoln  Laboratory.  RCSTooLLBox  is  designed  to  run 
on  UNIX  operating  systems  -  primarily  Silicon  Graphics  systems,  which  offer  excellent 
graphic  visualization  capabilities.  Dr.  Burt  provided  the  author  with  a  version  optimized 
to  operate  on  Sun  SPARC  workstations,  albeit  without  full  graphics  capability. 

Other than operating system, from this user’s perspective the only significant
difference between SRCRCS and RCSTooLLBox is that RCSTooLLBox models only perfect
conductors, rather than the full range of conductivities modeled by SRCRCS. In both
cases, targets are modelled as three-dimensional combinations of polyellipsoids
(heuristically, “warped cylinders”), triangular plates, and point scatterers that physically
resemble desired targets to any desired degree. The radar scattering process is modelled
using geometric optics and physical diffraction assumptions. A particularly useful fact about
the relationship between SRCRCS and RCSTooLLBox is that the detailed but user-friendly
operators’ manuals for SRCRCS [53] apply in many cases verbatim to RCSTooLLBox.

Multiple bounces by incident radiation are not modelled by SRCRCS and
RCSTooLLBox. Therefore, their output signatures do not exhibit in particular the cavity or
corner reflector effects that are typical of real signatures from aircraft and other tactical
targets. The scattering assumptions produce highly specular, or mirror-like, signatures -
that is, over most aspect angles, we receive generally less return radiation than a real
target would return, but a few aspect angles return substantially more. SRCRCS and
RCSTooLLBox-derived HRR radar signatures do exhibit the relative motion and aspect angle
variation of returns from major scatterers that are characteristic of actual HRR radar
returns, but their principal attraction is computational economy rather than precision.

The current primary modelling alternative to SRCRCS and RCSTooLLBox in use at
the Wright Laboratory Target Recognition Technology Branch (WL/AARA) is a ray casting
model called XPATCH, which uses geometric optics assumptions, allows for multiple
bounces, and employs highly detailed target models to provide target signature estimates
that correspond to observed results much more closely than SRCRCS and RCSTooLLBox.
WL/AARA expects XPATCH to play a key role in generation of signature libraries for
on-line target recognition with HRR radar. On a SPARC terminal, however, generating
one XPATCH signature (i.e., for a given target model, aspect angle, and polarization
value) may require several hours, whereas an RCSTooLLBox signature for models of the
complexity level used in this research (shown below) takes about 30 seconds. Since a
typical analysis run requires about 2000 signatures for each of several models, generation
of XPATCH signatures for this research was considered to be out of the question. Had
adequate XPATCH signature libraries been available at the start of this research,
however, they would have been used, and extensions to this research should certainly employ
XPATCH or comparable signature data if available.

As  of  this  writing,  additional  signature  simulation  tools  have  become  available  locally 
that  show  great  promise  for  future  effort  in  multisensor  fusion.  Chief  among  these  is  the 
“BRL-CAD” Package [65], or Ballistics Research Laboratory Computer Automated Design
Package, available from the U.S. Army Ballistics Research Laboratory (BRL) at Aberdeen
Proving  Ground,  Maryland.  This  package  includes  exceptionally  detailed  target  modeling 
capability,  and  has  been  augmented  by  a  set  of  radar  signature  calculation  codes.  These 
include  the  Simulated  Radar  IMagery,  or  SRIM  code  by  the  Environmental  Research 
Institute  of  Michigan  (ERIM),  the  SARSIM  codes  by  Northrop,  and  the  TRACK  code 
by  Georgia  Tech  Research  Institute  (GTRI).  The  first  code  is  a  ray  casting  signature 
generator  apparently  similar  to  that  in  XPATCH,  while  the  latter  two  codes  are  apparently 
more  comparable  to  RCSTooLLBox.  The  reader  is  referred  to  Vol.  I  of  [65],  or  equivalently 
to [64] for a good discussion of these models and the radar modelling tradeoffs in BRL-CAD,
which  illuminate  equally  well  the  comparison  of  SRCRCS/RCSTooLLBox  and  XPATCH 
above. 

Most notably in contrast to RCSTooLLBox and XPATCH, however, BRL-CAD
models the target image in visual and infrared spectra - “multispectral” capability is essential
in  multisensor  fusion  research.  BRL-CAD  was  not  on-line  at  AFIT  at  the  start  of  this 
research,  but  is  at  present. 

5.3.3 RCSTooLLBox Models and Signatures. SRCRCS and RCSTooLLBox
target models are defined in an ASCII text file format called a “SCAMP” file. A SCAMP file
defines  the  collections  of  polyellipsoids,  triangular  plates,  and  point  scatterers  that  define 
a  target  in  three-space,  and  provides  supplemental  information  regarding  the  scattering 
properties  of  these  objects.  Modelling  procedures  for  SRCRCS  and  RCSTooLLBox  are 
described  in  detail  in  [53].  One  particular  point  that  requires  emphasis  is  that  care  must 
be  taken  to  ensure  that  SCAMP  model  surfaces  do  not  intersect,  since  intersections  tend  to 
confuse  the  shadowing  calculation  algorithm.  Fig.  5.1  shows  the  SRCRCS/RCSTooLLBox 
target  body  coordinate  frame  convention,  which  differs  from  the  airframe  convention  used 
in  the  flight  control  community  (to  be  shown  in  Fig.  5.23). 

SRCRCS  and  the  full-graphics  versions  of  RCSTooLLBox  can  render  SCAMP  file 
contents  as  a  visual  image  from  any  aspect  -  if  desired,  causing  surfaces  not  visible  to 
the  radar  to  be  graphically  invisible  also.  Figs.  5.2  through  5.7  show  typical  SCAMP 
models used early in this research, as rendered on SRCRCS (recall that the version of
RCSTooLLBox used in this research does not provide graphic displays). Note that the
asterisks in these figures represent point scatterers, and that some surfaces have been
declared as “one-sided” scatterers, and are not visible in this representation.

Figure 5.1. The SRCRCS/RCSTooLLBox Radar Target Coordinate Frame

Figs.  5.2  through  5.7  were  attempts  to  make  “faithful”  representations  of  typical 
targets of interest. Analysis of target recognition algorithm performance using these
models,  however,  will  show  that  for  the  trajectories,  signature  metric,  and  noise  levels 
employed,  all  algorithms  selected  the  correct  target  100%  of  the  time  (although  sequence 
comparison  methods  did  so  with  a  higher  confidence  level,  or  lower  ambiguity,  as  observed 
using  the  generalized  ambiguity  function).  To  create  more  ambiguous  targets,  then,  the 
author  was  ultimately  driven  to  augment  the  target  models  with  additional  point  scatterers. 
Fig. 5.8 shows a MIG-21 so augmented. Also, where overall target size appeared to
be a key discriminant, the author scaled targets relative to one another to reduce significant
size  differences.  Only  with  such  targets,  particular  trajectories,  and  high  noise  levels  was 
it  possible  to  generate  cases  in  which  conventional  techniques  would  fail  (as  defined  in 
Sect.  3.11.2),  but  the  proposed  techniques  would  correctly  identify  targets. 

As  discussed  in  Chapter  III,  for  the  purpose  of  this  research  it  was  necessary  to  define 
not  only  target  models  of  interest,  but  also  target  models  (parameter  sets)  in  some  sense 
“in  between”  two  parent  target  models.  These  “interpolated  models”  will  define  specific 
likelihood  functions  for  target  parameters  in  between  the  “endpoint”  or  origin  targets 
of  interest,  and  will  allow  us  to  construct  the  generalized  ambiguity  function  curves  for 
each  basic  form  of  likelihood  function  (see  Sect.  3.11).  The  SCAMP  file  format  allows  a 
straightforward  approach  to  “interpolated”  targets.  Each  parent  was  defined  by  the  same 
number  and  type  of  shapes  and  surfaces,  but  the  locations  occupied  by  these  objects  in 
3-dimensional  space  differed  according  to  the  size  and  shape  of  the  respective  target.  For 
example,  the  two  targets  in  Figs.  5.2  and  5.3  are  defined  by  the  same  number  and  type  of 
shapes  and  surfaces,  with  only  locations  changed.  Similarly,  the  three  targets  in  Figs.  5.4 
through  5.6  are  commonly  defined  in  this  way.  Borrowing  from  the  evolving  language  of 
computer  graphics  [22],  then,  3-D  linear  interpolation  “morphs”  (morphological,  or  shape, 
transformations)  were  performed  to  obtain  new  “points”  in  target  parameter  space,  or  new 
targets in some sense “between” the two parents.

Figure 5.2. Boeing B-737

Figure 5.3. Yakovlev YAK-28

Figure 5.4. Mikoyan MIG-21

Figure 5.5. Sukhoi SU-22

Figure 5.6. McDonnell-Douglas F-4

Figure 5.7. McDonnell-Douglas F-4 with stores (bombs)

Figure 5.8. Mikoyan MIG-21 with Extra Point Scatterers

The morphing process is performed by a modified version of an original
RCSTooLLBox subroutine that reads two SCAMP parent files line-by-line and writes output
SCAMP files for desired interpolation (or extrapolation) values. The output SCAMP morphs
are then post-processed somewhat in the usual SCAMP file editing process to re-smooth
polyellipsoid surfaces and eliminate intersections of surfaces. Fig. 5.9 shows an F-4 Phantom II
and a MIG-21 as parent targets, and a pseudo-target defined by 50% interpolation between
the parents. Note the difference in size among the three models - often a key discriminant,
where the target-sensor relative position can exploit it. It must be emphasized that this
linear interpolation was never expected to translate into linear changes of the likelihood
function outputs, and it did not.
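
The interpolation itself can be sketched as follows, assuming each parent has already been
reduced to an ordered list of matching 3-D points. This is an illustration only: it does not
read or write the SCAMP file format, the function name `morph` is hypothetical, and the
coordinates are invented for the example rather than taken from any target model.

```python
def morph(parent_a, parent_b, t):
    """Linearly interpolate matching 3-D points of two parent models.

    t = 0 returns parent_a, t = 1 returns parent_b, and values outside
    [0, 1] extrapolate beyond the parents.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must define the same number of points")
    return [
        tuple((1.0 - t) * a + t * b for a, b in zip(point_a, point_b))
        for point_a, point_b in zip(parent_a, parent_b)
    ]


# Hypothetical nose and tail points (metres) for two parent models.
parent_a = [(0.0, 0.0, 0.0), (18.0, 0.0, 0.0)]
parent_b = [(0.0, 0.0, 0.0), (14.0, 0.0, 2.0)]
half = morph(parent_a, parent_b, 0.5)  # 50% interpolation between parents
```

The sketch requires both parents to have the same number of points, mirroring the
requirement that both parents be defined by the same number and type of shapes. It does
not perform the re-smoothing or intersection removal described above.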

5.3.4  Test  Data  Analysis  for  Noise  Statistics.  Like  SRCRCS,  RCSTooLLBox 
is  deterministic.  As  noted  in  Sect.  2.2.3,  however,  actual  signatures  exhibit  considerable 
noise,  or  random  variation  with  time,  even  for  constant  aspect  angle.  All  of  the  signature 
comparison  metrics  available  for  this  research  assume  a  random  noise  component  in  the 
signature. In particular, the Mahalanobis metric assumes Gaussian signature statistics,
and requires a covariance matrix for computation of the weighted distance between two
signatures. The Mahalanobis metric was used in all of the proposed algorithms to provide
a comparison baseline, and obtaining some representation for HRR radar signature means
and covariances was essential.

Figure 5.9. F-4, 50% Morph, and MIG-21. Note the difference in size between the three
models - often a key discriminant, where the target-sensor relative position
exploits size differences.

Therefore,  an  analysis  of  actual  HRR  radar  test  results  was  conducted  using  test  data 
and  software  provided  by  the  Wright  Laboratory  Target  Recognition  Technology  Branch 
(WL/AARA)  [166,  20].  The  data  were  taken  from  a  ground  station  illuminating  typical 
aircraft  in  flight.  Each  signature  encompassed  hundreds  of  range  bins  over  a  classified 
range  extent,  and  hundreds  of  signatures  were  available  for  areas  of  small  angular  extent 
or windows (say 10 x 10 degrees) on any given test set. Signatures were coded as to
discrete aspect angle value within each window, using aspect angle information recorded
by telemetry on the target aircraft. Thus, statistical information like means and covariance
matrices can be developed for signatures within the whole window or sub-windows on each
data  file. 

Three  signatures  chosen  at  random  from  a  typical  data  set  of  this  research  were  shown 
in Fig. 2.1 in Chapter II. These results are from a 7 x 10 degree window over which 484
signatures were taken. Each of these signatures is actually the result of summing returns
from several dozen individual pulses taken over a period of much less than one second.
The 128 range bin amplitude values shown are found by cropping a number of bins off
each end of the original signature, downsampling the remaining bins to 256 bins by taking
the maximum amplitude in each set of three bins, and finally cropping bins as required at
either end of the 256 bin extent to center the signature in the remaining 128-bin extent.
The length of the signatures is of the order of the size of an aircraft, so that the bin widths
lie in the range of 0.05 to 0.5 meters. The centering process is performed essentially by
correlating each signature with the mean of all previous signatures.

These  signatures  are  not  normalized  for  total  energy.  Normalization  has  been  used 
to  remove  signature  differences  due  to  total  energy,  making  the  fundamental  shape  of 
signatures  for  a  single  target  more  consistent  with  changes  in  aspect  angle,  or  attempting  to 
reduce  uncertainties  due  to  imprecisely-known  atmospheric  transmission  coefficients,  radar 
calibration,  etc.  However,  the  normalization  can  also  increase  ambiguity  in  comparing  two 
different  targets.  At  this  time,  it  appears  that  some  form  of  energy  normalization  may  be 
required  in  HRR  radar  target  recognition  [166,  56]. 

The  aspect  angle  window  for  Fig.  2.1  is  positioned  approximately  in  the  plane  of  the 
wings looking at the rear of the aircraft. The leftmost portion of this figure corresponds to
the  rear  of  the  target,  and  the  figure  extends  in  range  to  the  right  toward  the  front  of  the 
target.  It  appears  that  there  are  three  fairly  strong  and  consistent  scatterers  in  the  area 
of  range  bins  40  to  60.  Since  this  is  a  jet  engine  aircraft,  the  HRR  radar  signature  from 
this  tail-end  aspect  strongly  exhibits  the  effects  of  multiple  bounces  within  the  rear  engine 
cavity  -  the  “trailing”  returns  in  the  area  of  bins  80  to  100  are  likely  due  to  cavity  effects. 

Figures  5.10  through  5.12  show  means  and  variances  for  collections  of  signatures 
over three different aspect angle windows and three different aircraft. Figs. 2.1 and 5.10
correspond to the same signature set. Note that the three scatterer locations evident
in  Fig.  2.1  stand  out  “in  the  mean”  on  Fig.  5.10.  Fig.  5.11  shows  means  and  variances 
from  a  set  of  returns  over  a  window  approximately  20  to  30  degrees  off  the  nose  of  the 
target  aircraft,  and  Fig.  5.12  shows  corresponding  results  from  near  nose-on  (all  signatures 
are  taken  close  to  the  plane  of  the  wings).  Since  the  origin  target  classes  are  unknown 
(classified  secret),  it  is  difficult  to  say  precisely  how  the  major  peak  locations  correspond 
to physical structures on the targets. In Fig. 5.12, however, we can be reasonably certain
that the first peak on the left corresponds to the forward fairing or bulkhead of the aircraft
frame.  Remember  that  the  forward  fairing  is  of  necessity  transparent  to  radar  waves  if 
the  target  aircraft  is  equipped  with  a  nose-mounted  radar  (in  which  case  the  fairing  is  a 
radome  -  note  that  this  fact  was  considered  in  the  earlier  modelling  efforts,  e.g.,  Fig.  5.7). 
The  next  peak  to  the  right,  then,  may  correspond  to  scatterers  from  the  cockpit  area,  and 
so  on. 

It  is  important  to  restate  and  emphasize  that  the  downsampling  technique  employed 
here was to pick the maximum amplitude value from the three values corresponding to
three  contiguous  original  bins.  Other  researchers  have  simply  discarded  two  out  of  three 
bin  amplitudes  -  possibly  discarding  a  high  amplitude  return  in  favor  of  a  null  value  in 
a  neighboring  bin.  It  is  the  belief  of  the  author  and  others  [19]  that  “pick  maximum 
amplitude  from  n  bins”  downsampling  (where  n  =  3  in  this  case)  is  more  likely  to  trap  the 
desired  information.  If  a  “pick  every  nth  amplitude”  approach  is  used  instead,  the  mean 
will  be  lower,  and  the  variance  higher,  just  as  we  would  expect. 
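
The difference between the two downsampling rules can be seen in a short sketch. The
function names and the six-bin example are hypothetical, and any trailing group of fewer
than n bins is simply dropped here, which is an assumption made for brevity.

```python
def downsample_max(bins, n=3):
    """Keep the maximum amplitude from each group of n contiguous bins."""
    return [max(bins[i:i + n]) for i in range(0, len(bins) - n + 1, n)]


def downsample_every_nth(bins, n=3):
    """Keep only the first amplitude from each group of n contiguous bins."""
    return [bins[i] for i in range(0, len(bins) - n + 1, n)]


# Hypothetical amplitudes: a strong return sits next to a null in each group.
bins = [0.0, 9.0, 1.0, 2.0, 0.0, 8.0]
kept_max = downsample_max(bins)        # retains both strong returns
kept_nth = downsample_every_nth(bins)  # misses both strong returns
```

Because the maximum of a group is never smaller than any single member, the
“pick maximum” output is bin-for-bin at least as large as the “pick every nth” output,
consistent with the lower mean noted above for the latter.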

In  any  case,  these  plots  show  clearly  that,  even  though  individual  signatures  are 
highly  noisy,  there  is  a  definite  statistical  consistency  in  the  signatures,  even  over  these 
relatively  large  aspect  angle  windows.  They  also  support  the  conclusion  that  variance  can 
be  treated  as  reasonably  constant  across  the  target  range  extent  for  a  variety  of  targets 
and  aspect  angles.  Clearly,  these  results  do  not  support  the  conclusion  by  some  [56]  that 
HRR  radar  signatures  “completely  decorrelate”  with  small  changes  in  aspect  angle. 

Most researchers treat HRR radar signatures as statistically independent from
bin-to-bin. In this analysis, the bin variance values were actually gathered as part of the
full covariance matrix for each set of HRR radar signatures, and correlation coefficients
were defined to show inter-bin statistical relationships. For the 128-bin vectors shown here,
correlation  coefficients  were  on  the  order  of  0.7  for  adjacent  bins,  dropping  to  roughly  0.3 
for  separations  of  five  bins,  and  to  roughly  0.2  for  separations  of  ten  bins.  Thus,  there  is 
significant  bin-to-bin  correlation,  but  this  is  to  be  expected  due  to  (1)  physical  relationships 
between  the  scatterers  in  adjacent  range  bins,  (2)  processing  in  the  radar  itself,  and  (3) 
errors  in  signature  alignment  or  range  registration.  Certainly,  some  portion  of  the  variance 
itself  in  these  signatures  arises  from  the  errors  in  range  registration. 
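
As a minimal sketch of how such inter-bin correlation coefficients can be obtained, assuming
the usual unbiased sample covariance normalized by the bin standard deviations: the function
name `correlation_matrix` and the three-bin data are hypothetical and are not the processing
code or measurements used in this analysis.

```python
def correlation_matrix(signatures):
    """Sample correlation coefficients between range bins.

    signatures: list of equal-length amplitude lists, one per signature.
    """
    count = len(signatures)
    nbins = len(signatures[0])
    means = [sum(sig[k] for sig in signatures) / count for k in range(nbins)]
    # Unbiased sample covariance between every pair of bins.
    cov = [
        [
            sum((sig[i] - means[i]) * (sig[j] - means[j]) for sig in signatures)
            / (count - 1)
            for j in range(nbins)
        ]
        for i in range(nbins)
    ]
    # Normalize each covariance by the two bin standard deviations.
    return [
        [cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5 for j in range(nbins)]
        for i in range(nbins)
    ]


# Hypothetical data: three signatures of three bins each.
signatures = [[1.0, 2.0, 5.0], [2.0, 4.0, 3.0], [3.0, 6.0, 4.0]]
rho = correlation_matrix(signatures)
```

The diagonal of the covariance matrix gives the bin variances, so the variances and the
correlation coefficients come from the same pass over the data.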
Figure 5.10. Measured Signatures: Means and Variances - Rear Aspect (test data taken
from tape GTwllAtran.dathr, results of [20], provided by [166], with further
processing by the author). Plot title: Mean Radar Signature, +/- One Std. Dev.
(7x10 degree window, 484 sigs.)

Figure 5.11. Measured Signatures: Means and Variances - Front/Side Aspect (test data
taken from tape GTw39Atran.dathr, results of [20], provided by [166], with
further processing by the author). Plot title: Mean Radar Signature, +/- One Std. Dev.
(9x10 degree window, 342 sigs.)

Figure 5.12. Measured Signatures: Means and Variances - Front Aspect (test data taken
from tape GTw49Atran.dathr, results of [20], provided by [166], with further
processing by the author). Plot title: Mean Radar Signature, +/- One Std. Dev.
(9x8 degree window, 458 sigs.)

Where  the  individual  bin  amplitudes  are  treated  as  Gaussian  and  a  Mahalanobis  (log 
Gaussian likelihood) metric is employed, however, the assumption of bin-to-bin
independence allows the Mahalanobis distance computation to be reduced from a matrix-vector
and vector-vector (inner product) multiplication to two inner product multiplications.
Assumption of constant variance across the target extent allows further reduction to one inner
product  multiplication  and  a  scalar  division.  Since  the  range  registration  process  in  general 
will  require  these  operations  to  be  performed  several  times  to  find  a  maximum  value  for 
any  one  association  of  an  observed  signature  with  a  library  signature,  these  opportunities 
for  reduction  in  on-line  computation  are  important.  In  particular,  as  noted  in  Sect.  2.2.3, 
the  assumptions  of  bin-to-bin  independence  and  constant  variance  allow  this  comparison 
to  be  done  only  once  in  the  frequency  domain  (this  research  was  performed  using  range 
domain  alignment). 
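
The three levels of computation can be sketched as follows. The function names are
hypothetical, each function returns the squared weighted distance (the quadratic form),
and the elementwise weighting in the diagonal case is one assumed way of organizing the
“two inner product” computation.

```python
def mahalanobis_full(x, mean, cov_inv):
    """Full covariance: matrix-vector product, then an inner product."""
    d = [a - b for a, b in zip(x, mean)]
    weighted = [sum(row[k] * d[k] for k in range(len(d))) for row in cov_inv]
    return sum(a * b for a, b in zip(d, weighted))


def mahalanobis_diagonal(x, mean, variances):
    """Bin-to-bin independence: weight each residual by its bin variance."""
    d = [a - b for a, b in zip(x, mean)]
    weighted = [dk / vk for dk, vk in zip(d, variances)]
    return sum(a * b for a, b in zip(d, weighted))


def mahalanobis_constant(x, mean, variance):
    """Constant variance: one inner product and one scalar division."""
    d = [a - b for a, b in zip(x, mean)]
    return sum(dk * dk for dk in d) / variance


# Hypothetical three-bin signature, library mean, and common variance.
x = [1.0, 2.0, 3.0]
mean = [0.0, 0.0, 1.0]
variance = 4.0
cov_inv = [[0.25, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.25]]

d_full = mahalanobis_full(x, mean, cov_inv)
d_diag = mahalanobis_diagonal(x, mean, [variance] * 3)
d_const = mahalanobis_constant(x, mean, variance)
```

When the covariance really is a constant variance times the identity, the three forms
agree; the savings matter because the comparison is repeated for each trial range
alignment.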

It  is  clear,  however,  that  researchers  generating  noise  samples  for  addition  to  some 
mean to provide a truth signature or realization should not lightly assume bin-to-bin
independence. In this research, the bin-to-bin independence assumption was made consciously
to  reduce  computation  in  the  simulation  process. 

Other  investigation  not  shown  in  detail  here  supported  the  practice  of  treating  the 
noise  in  any  one  bin  as  Gaussian  in  units  of  decibels  (to  the  extent  that  errors  in  range 
registration  allow  us  to  examine  processes  in  a  single  range  bin),  but  this  does  not  imply 
that  the  HRR  radar  signature  vector  can  be  treated  as  a  multivariate  Gaussian  process. 
Treating  radar  cross  section  as  Gaussian  in  decibels  is  equivalent  to  the  use  of  the  classical 
“lognormal”  statistical  model  for  cross  section  in  units  of  area  [15:114],  but  the  author 
has  found  no  formal  analysis  to  the  effect  that  this  is  a  valid  model  for  HRR  radar  cross 
section.  In  some  sources  (e.g.,  [92]),  no  clear  indication  was  given  as  to  whether  or  not 
radar cross section data treated as Gaussian was in fact given in units of square area (i.e.,
proportional  to  signal  power)  or  log  square  area.  Since  it  was  desired  to  model  signal 
statistics  at  least  to  first  order,  this  small  effort  was  expended  to  gain  confidence  in  the  use 
of  dBsm  as  a  unit  of  measure  -  the  intent,  however,  was  only  to  obtain  some  probabilistic 
signature  description,  in  whatever  units. 
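
The equivalence between a Gaussian model in decibels and a lognormal model in units of
area follows from a change of logarithm base. The symbols below are introduced here for
illustration only: \sigma is the cross section in square meters, X is the same quantity in
dBsm, and \mu_{dB} and s_{dB} are the assumed mean and standard deviation in decibels.

```latex
\begin{align*}
X &= 10\log_{10}\sigma \;\sim\; \mathcal{N}\!\left(\mu_{dB},\, s_{dB}^{2}\right) \\
\ln\sigma &= \frac{\ln 10}{10}\,X \;\sim\;
  \mathcal{N}\!\left(\frac{\ln 10}{10}\,\mu_{dB},\;
  \left(\frac{\ln 10}{10}\right)^{2} s_{dB}^{2}\right)
\end{align*}
```

Since \ln\sigma is Gaussian, \sigma is lognormal by definition, with median
10^{\mu_{dB}/10} square meters.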

Other  issues  of  statistical  interest  not  addressed  here  are  the  degree  of  independence 
of  signatures  over  time,  with  or  without  changing  aspect  angle,  and  the  possibility  of 
multiplicative  error  sources.  Our  research  will  assume  that  the  signatures  are  a  function 
only  of  target  class,  incident  radar  waveform,  aspect  and  polarization  angles,  and  that 
noise  in  each  HRR  radar  signature  bin  is  additive,  independent  from  bin-to-bin,  white 
(independent  in  time)  and  Gaussian  in  dBsm.  Noise  standard  deviations  from  five  to  nine 
dBsm  were  used.  As  shown  in  Figs.  5.10  through  5.12,  standard  deviations  of  five  to  six 
dBsm  are  supported  by  experiment:  higher  levels  were  used  to  increase  ambiguity,  thereby 
accentuating  improvements  from  the  proposed  recognition  methods. 

In  conclusion,  note  that  detailed  statistical  analysis  of  HRR  radar  signatures  was  not 
an objective of this research - this subject was pursued only far enough to provide
reasonable information for simulation studies. The author believes, however, that additional
research in this area is much needed.

5.3.5 Generating Simulated Signatures. The results of the previous section have
provided a general understanding of observed statistics for actual HRR radar tests on
typical aircraft. This research used these statistics in two ways: (1) to generate noise samples
for addition to the deterministic bin amplitudes given by RCSTooLLBox, and (2) to
define (diagonal) covariance matrices for use in Mahalanobis metrics. The RCSTooLLBox
signatures defined the “mean” signatures, and the test data-defined noise covariances
provided baseline statistics for noise generation and definition of the Mahalanobis (covariance)
weighting  matrix. 

Each  of  the  following  figures  contains  two  curves.  Figures  5.13  through  5.15  contain: 
(1)  the  deterministic  RCSTooLLBox  signature  for  a  given  target  model  (the  F-4  shown 
in  Fig.  5.6),  aspect  angle,  polarization  angle,  center  frequency  (10  GHz),  and  bandwidth 
(1  GHz),  and  (2)  one  noise-corrupted  signature,  given  by  the  sum  of  the  deterministic 
signature  with  a  vector  of  independent  noise  realizations  generated  consistently  with  the 
statistics  defined  above  (a  standard  deviation  of  5  dBsm,  constant  for  all  bins).  The  last  of 
the  signatures  generated  under  these  assumptions,  Fig.  5.16,  is  the  counterpart  of  Fig.  5.14 
for  the  MIG-21  model  shown  in  Fig.  5.4;  that  is,  a  signature  that  would  be  generated 
for  a  MIG-21  flying  the  same  trajectory  (accounting  for  fundamental  differences  in  the 
aerodynamics  of  the  MIG-21  and  F-4).  Radar  center  frequency  and  bandwidth  figures 
were  selected  to  be  consistent  with  parameters  observed  to  be  in  use  by  radar  researchers 
under  contract  to  Wright  Laboratory  [21]. 
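
As an illustration only, the noise-augmentation step and the diagonal-covariance Mahalanobis distance described above might be sketched as follows. The function names and the eight-bin "mean" signature are hypothetical and are not taken from the research software; only the 5 dBsm standard deviation, constant across bins, comes from the text.

```python
import random

def add_signature_noise(mean_dbsm, sigma_db, rng):
    """Return one noisy realization: the mean signature plus
    independent zero-mean Gaussian noise in every range bin (in dBsm)."""
    return [m + rng.gauss(0.0, sigma_db) for m in mean_dbsm]

def mahalanobis_sq_diag(observed, model_mean, sigma_db):
    """Squared Mahalanobis distance for a diagonal covariance
    with the same variance sigma_db**2 in every bin."""
    var = sigma_db ** 2
    return sum((o - m) ** 2 / var for o, m in zip(observed, model_mean))

# Hypothetical 8-bin "mean" signature in dBsm (illustrative values only).
mean_sig = [-40.0, -38.0, -12.0, -10.0, -25.0, -8.0, -35.0, -40.0]
rng = random.Random(0)                      # fixed seed for repeatability
noisy = add_signature_noise(mean_sig, 5.0, rng)
d2 = mahalanobis_sq_diag(noisy, mean_sig, 5.0)
```

With a diagonal covariance the metric reduces to a variance-weighted sum of squared bin differences, which is why bins that extend past the end of a shorter target contribute heavily to the distance.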

These  signatures  are  taken  generally  from  a  nose-on  aspect  angle:  the  forward  part 
of  the  target  corresponds  to  the  left  part  of  the  signature,  and  the  rear  part  of  the  target 
to  the  right  part  of  the  signature.  Each  range  bin  is  approximately  0.225  meters  long.  In 
particular,  the  two  peaks  at  -10  dBsm  just  after  bin  number  40  on  the  F-4  models  are  due 
to  cockpit  scatterers,  and  the  rather  large  return  at  range  bin  80  is  due  to  the  rear  engine 
structure.  In  the  MIG-21  signature,  only  one  cockpit  scatterer  is  present,  and  the  engine 
structure  has  largely  disappeared,  leaving  only  a  small  return  from  the  tail. 

The reader will observe that the signatures in Figs. 5.14 and 5.16 exhibit some similarities,
but remain distinctly different - particularly in overall length of the signature, owing to the
large difference in length between the F-4 (18 meters) and the MIG-21 (13.5 meters). The
Mahalanobis metric and other HRR radar signature metrics are very sensitive to signature
length, and, even with large standard deviation noise added, conventional (independent
look) algorithms operating on these simulated signatures have little trouble selecting the
correct target class. For real signatures, “multibounce” effects from the engine intake
cavities and similar effects would make these length differences less noticeable, and this
would increase ambiguity considerably.

Figure 5.13. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). F-4 Model, Frontal Aspect, 10 GHz Center Freq.,
1 GHz Bandwidth, 5 dBsm Std. Dev. Noise.

Figure 5.14. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). F-4 Model, Frontal Aspect, 10 GHz Center Freq.,
1 GHz Bandwidth, 5 dBsm Std. Dev. Noise (displaced from signature in
Fig. 5.13 by 0.8 sec., 1.83 deg.).

Figure 5.15. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). F-4 Model, Frontal Aspect, 10 GHz Center Freq.,
1 GHz Bandwidth, 5 dBsm Std. Dev. Noise (displaced from signature in
Fig. 5.14 by 0.8 sec., 1.83 deg.).

Figure 5.16. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). MIG-21 Model, Frontal Aspect, 10 GHz Center
Freq., 1 GHz Bandwidth, 5 dBsm Std. Dev. Noise (corresponds for same
kinematics to F-4 signature in Fig. 5.14).

Ultimately,  sufficiently  challenging  recognition  tests  were  found  by  defining  scenarios 
satisfying  these  conditions:  (1)  target  models  augmented  with  extra  scatterers;  (2)  models 
scaled to similar size and/or viewed at aspect angles where target size was not a factor;
(3) reduced radar bandwidth, so that range resolution was reduced; and (4) Gaussian
signature noise of standard deviation eight dBsm or higher (i.e., rather greater standard deviation
than values observed in actual data). The decision to reduce radar bandwidth was made
after consultation with Mr. Walter Barnes [11] of WL/AAR, who indicated that the 1
GHz bandwidth figures common to recent research efforts might not be practical in some
implementations, but that a reasonable simulation baseline would be approximately
one-third of the 1 GHz figure.

Figures  5.17  through  5.19  are  typical  of  these  “challenging”  signatures.  These  are 
taken from an aspect angle in which the plane of the wings is nearly normal to the
sensor-to-target or boresight vector, so that the overall range extent of the signature is minimized.
Figs.  5.17  and  5.18  are  taken  from  the  MIG-21  target  model  in  Fig.  5.8,  while  Fig.  5.19  is 
taken  from  a  version  of  the  SU-22  as  in  Fig.  5.5,  but  downscaled  20%  to  the  size  of  the 
MIG,  and  equipped  with  scatterers  in  locations  corresponding  to  those  of  the  MIG,  with 
the  same  scatterer  radar  cross  sections.  The  aspect  angle  of  Fig.  5.19  corresponds  exactly 
to  that  of  Fig.  5.17  (identical  aerodynamic  characteristics  are  assumed).  Finally,  consistent 
with  the  comments  in  the  previous  paragraph,  the  radar  bandwidth  for  these  signatures 
was  333  MHz  (10  GHz  center  frequency,  as  above)  and  noise  standard  deviations  are  set 
to  nine  dBsm. 

Comparing Figs. 2.1 and 5.10 through 5.12 to Figs. 5.13 through 5.19, the reader will
observe that the apparent “noise floors”, or mean amplitudes for range bins that evidently
correspond to free space (atmosphere), are generally lower for the simulated signatures than
for the observed signatures. This is by choice. The deterministic noise floors returned by
RCSTooLLBox for the models shown above tend to lie between -50 and -80 dBsm. The
author consciously raised the noise floor for simulated outputs to values from -30 to -45
dBsm to provide a signal structure, or minimum-to-maximum amplitude range, comparable
to that of the measurements. Since the peak simulation returns in general tend to be lower
than those for available test data (presumably due to the specular model effects noted
in Sect. 5.3.2), raising the simulation noise floor to the test-observed values of -30 to -35
dBsm would have left little signal structure for comparison.

Figure 5.17. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). MIG-21 Model, Extra Scatterers, Top Aspect,
10 GHz Center Freq., 333 MHz Bandwidth, 9 dBsm Std. Dev. Noise.

Figure 5.18. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). MIG-21 Model, Extra Scatterers, Top Aspect,
10 GHz Center Freq., 333 MHz Bandwidth, 9 dBsm Std. Dev. Noise
(displaced from signature in Fig. 5.17 by 0.8 sec., 3.67 deg.).

Figure 5.19. Simulated Signatures: RCSTooLLBox-Generated (Solid Line) and
Noise-Augmented (Dotted Line). SU-22 Model, Extra Scatterers, Top Aspect,
10 GHz Center Freq., 333 MHz Bandwidth, 9 dBsm Std. Dev. Noise
(corresponds for same kinematics to MIG-21 signature in Fig. 5.17).

For a few aspect angles, however, the specular model associations will produce very
high amplitude returns. It should be recalled from Sect. 2.2.3 that the range sweep generating
process can be viewed as the convolution of a sinc-shaped radar pulse in range/time
(i.e., a rectangle in frequency with chosen center frequency and bandwidth) with an array
of scatterers represented as impulse (i.e., Dirac delta) functions. At certain aspect angles,
a scatterer may possess a particularly high cross-section, or impulse function magnitude.
For example, consider a flat, perfectly conducting plate normal to the incidence direction
from a monostatic radar (i.e., a radar in which the transmit and receive antennas are
co-located). The convolution process, then, will result in “range sidelobes”, or “false” peaks
in the HRR radar signature, due to convolution of the high cross section scatterer with
side lobes of the sinc-shaped pulse.
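
The sidelobe mechanism can be illustrated with a deliberately simplified sketch. It assumes real-valued (phase-free) scatterer amplitudes, an unweighted sinc pulse, and a hypothetical sampling of two range bins per resolution cell; none of these choices is taken from RCSTooLLBox, and all names are illustrative.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def range_profile_dbsm(scatterers, n_bins, bins_per_cell=2.0, floor_dbsm=-80.0):
    """Convolve impulse scatterers with a sinc pulse.

    scatterers: list of (bin_position, rcs_dbsm) pairs.
    bins_per_cell: assumed number of range bins per resolution cell.
    Returns the profile in dBsm, limited below by floor_dbsm.
    """
    profile = []
    for k in range(n_bins):
        # Each impulse contributes its amplitude times the shifted pulse.
        amplitude = sum(10.0 ** (rcs / 20.0) * sinc((k - pos) / bins_per_cell)
                        for pos, rcs in scatterers)
        power = amplitude * amplitude
        level = 10.0 * math.log10(power) if power > 0.0 else floor_dbsm
        profile.append(max(level, floor_dbsm))
    return profile

# One strong +30 dBsm scatterer at bin 10 (hypothetical flat-plate return).
profile = range_profile_dbsm([(10, 30.0)], n_bins=32)
```

In this sketch the first sidelobe sits roughly 13 dB below the main peak, so a single +30 dBsm scatterer produces "false" peaks well above a -30 dBsm noise floor many bins away from its true location.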

This effect is observed in simulations and tests, but the specular nature of RCSTooLLBox
models means that the effect can be so pronounced for some models and aspects (e.g.,
the F-4 model in Fig. 5.6 at side-on aspects) that the HRR radar signature is raised to +30
dBsm or more in every range bin. Since the signature so generated looks like a wall, the
author refers to this effect colloquially as the “wall” effect. Due to effects of this nature, the
author inserted code to identify and skip “wall effect” signatures, and to limit or “clip”
individual high amplitude peaks. Clipping high amplitude peaks also tends to increase
ambiguity, which helps to demonstrate the proposed methods, as well as to provide more
realistic signatures.
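
A minimal sketch of such screening is given below. The +30 dBsm wall criterion follows the description above; the clip level of 10 dBsm and all function names are assumptions made purely for illustration, since the actual values and code are not reproduced here.

```python
def is_wall(signature_dbsm, wall_level_db=30.0):
    """True when every range bin is at or above the wall level."""
    return all(v >= wall_level_db for v in signature_dbsm)

def clip_peaks(signature_dbsm, clip_level_db):
    """Limit individual high-amplitude peaks to clip_level_db."""
    return [min(v, clip_level_db) for v in signature_dbsm]

def screen_signatures(signatures, wall_level_db=30.0, clip_level_db=10.0):
    """Skip 'wall effect' signatures and clip peaks in the remainder.

    clip_level_db is a hypothetical value chosen for this example.
    """
    kept = []
    for sig in signatures:
        if is_wall(sig, wall_level_db):
            continue                      # skip the whole signature
        kept.append(clip_peaks(sig, clip_level_db))
    return kept

wall = [35.0, 40.0, 31.0]
normal = [-40.0, 15.0, -5.0]
screened = screen_signatures([wall, normal])
```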

5.3.6 Summary. This section has described the methodology by which simulated
signatures were generated for this research. The process selected, after consideration
of trade-offs for model availability, time, and fidelity, was to generate signatures
with RCSTooLLBox using models that resemble tactical targets of interest to first order.
Measurement realizations were generated by adding zero-mean, independent (from
bin-to-bin), white (independent in time, or from signature to signature) Gaussian noise to
the RCSTooLLBox “mean” signatures for true aspect angles on the model chosen to be
the “true” target. Noise statistics were fundamentally consistent with those for observed
signatures.

5.4 Initial Concept Demonstration

5.4.1 Test Objectives. As a preliminary confidence-building test of the basic
motion-warping concept, an early demonstration was conducted using several key tools
planned for use in the proposed research. The objective of this research was to investigate
the potential for applying dynamic programming sequence comparison techniques to
sequences of high range resolution radar signatures. Significant questions were:

(1) Would a classical sequence comparison or dynamic time warping algorithm work at
all with a high range resolution radar signature comparison metric and sequences of range
sweeps?

(2) Would similar (close in an aspect angular sense) but non-identical sequences from the
same target produce small warping path distances?

(3)  Would  matches  over  the  same  or  similar  aspect  paths  between  significantly  different 
target  models  produce  significantly  larger  matching  path  distances  than  those  seen  in  the 
preceding  case  (2)? 

The  tools  used  in  the  test  were: 

(1) The Syracuse Research Corporation (SRC) HRR radar range sweep generator “SRCRCS”
[53] - RCSTooLLBox was not yet available at the time of this initial set of tests.

(2) Aircraft target models for use with SRCRCS, provided by Wright Laboratory
(WL/AARA) [166].

(3) The General Dynamics “slide distance” metric for inter-sweep “distance” [94], as
discussed in Sect. 2.2.3.

(4)  A  simple  time  warping  algorithm  given  in  App.  B  of  [176],  with  modification  by  the 
author  to  reduce  the  cost  penalty  for  warping  path  length.  Specifically,  we  multiply  the 
added  (new  association  point)  cost  for  vertical  and  horizontal  transitions  (see  Fig.  2.8)  by 
a  factor  of  so  that  the  sum  of  added  costs  for  one  vertical  and  one  horizontal  transition 
adds  roughly  the  same  cost  as  one  diagonal  transition,  as  discussed  in  [195]. 
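
The kind of time warping recursion described in item (4) can be sketched as below. This is a generic dynamic time warping routine, not the algorithm of [176]; the weight of 0.5 on vertical and horizontal transitions is an assumed value chosen so that one vertical plus one horizontal step costs about as much as one diagonal step, and the scalar absolute-difference distance merely stands in for the slide distance metric.

```python
def dtw_distance(seq_a, seq_b, dist, straight_weight=0.5):
    """Endpoint-constrained dynamic time warping distance.

    dist(a, b) is any non-negative inter-element distance.
    straight_weight (assumed 0.5) scales the added cost of vertical
    and horizontal transitions; diagonal transitions have weight 1.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = dist(seq_a[0], seq_b[0])       # paths must start at (0, 0)
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            d = dist(seq_a[i], seq_b[j])
            best = inf
            if i > 0 and j > 0:
                best = min(best, cost[i - 1][j - 1] + d)              # diagonal
            if i > 0:
                best = min(best, cost[i - 1][j] + straight_weight * d)  # vertical
            if j > 0:
                best = min(best, cost[i][j - 1] + straight_weight * d)  # horizontal
            cost[i][j] = best
    return cost[n - 1][m - 1]                   # paths must end at (n-1, m-1)

abs_diff = lambda a, b: abs(a - b)
same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2], abs_diff)
offset = dtw_distance([0, 0, 0], [1, 1, 1], abs_diff)
```

In practice each sequence element would be a full range sweep and `dist` an inter-sweep metric; the recursion itself is unchanged.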

This test did not require or employ a target tracker or aspect angle estimator algorithm,
since such algorithms had been successfully tested by others in the past [5, 77, 120,
121]. 

5.4.2 Test Procedures. The test was begun by generating sequences of range
sweeps in the SRCRCS model using the “RTI” (Range-Time Intensity) option [53]. This
option provides radar range sweeps over a user-specified angular extent (some portion
of a “great circle” on the hypothetical aspect angle sphere) at discrete angular values.
Pairs of such sequences were then processed to extract peaks and the processed sequences
were warped against each other, using the slide distance metric as an inter-sweep distance
measure.

Figs.  5.20  through  5.22  show  three  typical  range  sweep  sequences,  covering  aspect 
angular  extents  of  180,  90,  and  90  degrees  respectively  over  different  paths,  for  aircraft 
models  similar  to  the  F-4  in  Fig.  5.6.  Note  that  each  plot  consists  of  37  “sweeps”  at 
discrete  angle  values,  listed  by  sweep  number  along  the  lower  right  axis.  Each  sweep 
consists  of  radar  cross  section  values  in  128  range  bins,  shown  along  the  bottom  axis.  The 
vertical  axis  shows  radar  cross  section  in  decibel  square  meters  (dBsm).  The  sensor  can 
be  thought  of  as  lying  to  the  extreme  left,  so  that  the  leftmost  returns  are  closest  to  the 
sensor. 

With  reference  to  the  SRC  X-Y-Z  target  coordinate  frame  in  Fig.  5.1,  Figs.  5.20 
through  5.22  represent  respectively  the  following  basic  aspect  angle  paths,  typical  of  those 
used  in  the  test: 

(1)  Fig.  5.20:  a  180  degree  angular  extent  in  the  SRC  X-Y  plane,  centered  on  the  SRC  Y 
axis,  with  37  sweeps  taken  at  5.0  degree  increments.  Thus,  the  angular  extent  of  Fig.  5.20 
lies  in  the  plane  of  the  wings,  moving  in  azimuth  angle  from  the  right  to  left  side  of  the 
target.  The  center  signature,  or  number  19  in  the  sequence  from  1  to  37  as  marked  in  the 
figure,  is  nose  on  to  the  target.  These  signatures  are  taken  from  a  generic  F-4  model. 

(2)  Fig.  5.21:  a  90  degree  angular  extent  in  the  plane  of  the  wings,  centered  on  the  SRC 
X  axis,  with  37  sweeps  taken  at  2.5  degree  increments.  Therefore,  this  angular  extent  lies 
from  the  right  front  (45  degrees  azimuth)  to  right  rear  (135  degrees  azimuth)  of  the  target. 
The  center  signature,  or  number  19  in  the  figure,  is  side  on  to  the  target.  These  signatures 
are  taken  from  a  generic  F-14  model. 

(3)  Fig.  5.22:  a  90  degree  angular  extent  in  the  SRC  X-Z  plane,  centered  on  the  SRC 
X  axis,  with  37  sweeps  taken  at  2.5  degree  increments.  Therefore,  the  90  degree  extent 
covered  by  Fig.  5.22  lies  in  the  plane  which  bisects  and  is  normal  to  the  longitudinal  axis  of 
the  target,  moving  from  -45  to  +45  degrees  in  elevation.  The  center  signature,  or  number 
19  in  the  figure,  is  side  on  to  the  target.  These  signatures  are  taken  from  a  generic  F-16 
model. 
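
The sweep spacing in each of these paths follows from dividing the angular extent by the number of intervals (36) rather than the number of sweeps (37). A small sketch, with a hypothetical helper name and with angles measured relative to an assumed reference direction, makes the arithmetic explicit:

```python
def sweep_angles(center_deg, extent_deg, n_sweeps):
    """Evenly spaced aspect angles covering extent_deg, centered on center_deg."""
    step = extent_deg / (n_sweeps - 1)          # 37 sweeps span 36 intervals
    start = center_deg - extent_deg / 2.0
    return [start + k * step for k in range(n_sweeps)]

# Fig. 5.20 style path: 180 degrees about the nose (taken here as 0 degrees).
path_180 = sweep_angles(0.0, 180.0, 37)
# Fig. 5.21 style path: 90 degrees about the side-on direction (90 degrees azimuth).
path_90 = sweep_angles(90.0, 90.0, 37)
```

Sweep number 19 (index 18) falls at the center of each path, which is why it is nose on in the first case and side on in the other two.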

Figure 5.20. Typical Range Sweep Sequence - F-4 model, 180 degree angular extent in
the plane of the wings, moving from the right to left side of the target,
centered on the target nose, with 37 sweeps taken at 5.0 degree increments.

Figure 5.21. Typical Range Sweep Sequence - F-14 model, 90 degree angular extent in
the plane of the wings, moving from the right front to right rear of the target,
centered on the right wing, with 37 sweeps taken at 2.5 degree increments.

Figure 5.22. Typical Range Sweep Sequence - F-16 model, 90 degree angular extent
normal to the plane of the wings, moving from the lower right to upper
right of the target, centered on the right wing, with 37 sweeps taken at 2.5
degree increments.


These  signatures  were  produced  with  a  center  frequency  and  bandwidth  of  1.25  GHz 
and  1.0  GHz,  respectively  (taken  from  the  example  in  the  SRCRCS  User’s  Manual  [53:18]). 
Note  that  Fig.  5.20  exhibits  the  “wall  effect”  discussed  in  Sect.  5.3.5  -  these  high-return 
signatures  at  sweeps  1  and  37  correspond  to  the  left  and  right  sides  of  an  F-4  model  in  the 
plane  of  the  wings,  where  the  radar  cross  section  is  quite  large. 

Readers  familiar  with  speech  processing  will  observe  the  similarity  between  these 
three  figures  and  frequency  vs.  time  plots  of  human  speech  -  viewed  from  above,  the 
migrating  peak  locations  (heuristically,  “mountain  ranges”)  in  these  plots  bear  an  uncanny 
resemblance  to  formant  tracks  [176:121-123],  or  high  energy  bands  over  time,  in  frequency 
vs.  time  plots  of  speech.  Formants  are  frequently  used  as  features  for  dynamic  time 
warping  in  speech  recognition  [176:297],  and  the  existence  of  similar  structures  in  these 
radar  signatures  (bearing  in  mind  that  true  signatures  have  much  more  noise)  was  a  strong 
inducement  to  continue  research. 

Three basic types of tests were conducted: (1) sequences extracted from different target
models over the same aspect angle range were warped against each other; (2) sequences
extracted from the same model, but over slightly different paths, were warped together;
and, (3) sequences from different targets over slightly different aspect angle paths were
warped together. Additionally, some clearly undesirable matches were attempted as well,
matching two sequences taken from completely different aspect angle paths over the same
or different models. The tests did not consider noise.

5.4.3 Test Results. All in all, the test was very successful. The answer to each
of the questions asked above was yes.

(1) Dynamic “time” warping or classical sequence comparison did work with the slide
distance  metric  and  sequences  of  range  sweeps.  The  sequence  expansions  and  compressions 
characteristic  of  warping  processes  were  observed  where  expected  in  these  tests. 

(2)  Similar  (close  in  an  aspect  angular  sense)  but  non-identical  sequences  from  the  same 
target  did  produce  small  warping  path  distances.  Warping  path  distances  between  0  (zero) 
and  60  slide  distance  units  (SDU)  were  generally  observed  for  such  cases. 

(3)  Matches  over  the  same  or  similar  aspect  paths  between  significantly  different  target 
models  did  produce  notably  larger  matching  path  distances  than  those  seen  in  the  preceding 
case  (2).  Warping  path  distances  in  excess  of  150  SDU  were  generally  observed  for  such 
cases. 

Some  potential  but  expected  problem  areas  were  identified.  In  particular,  it  was 
noted  that  for  similar  target  classes  (e.g.,  an  F-14  and  an  F-15),  the  warping  algorithm 
may  provide  a  closer  warping  distance  for  mismatched  aircraft  over  a  given  identical  aspect 
angle  path  than  for  the  same  aircraft  over  slightly  different  (say  10  degrees  “off-nominal 
path”)  aspect  angle  paths.  This  was  an  early  indication  that  “off-nominal  path”  errors 
could be significant, and that a single “one-dimensional” sequence comparison for any
given  target  would  not  be  adequate. 

This  observation  can  be  stated  in  terms  of  expected  ambiguity  function  behavior  (see 
Sect.  3.11.3)  for  movements  or  changes  in  aspect  angle  state  and  target  parameter  spaces. 
It  appears,  for  example,  that  the  generalized  ambiguity  function  for  (1)  comparison  of 
signatures  from  one  path  relative  to  those  from  other  “close”  paths  on  the  same  aircraft 
(i.e., changes in aspect angle state history) may be more sharply peaked (i.e., exhibiting
more downward curvature) than the generalized ambiguity function for (2) comparison
of this signature sequence to those from the same path over other (similar but different)
aircraft  (i.e.,  changes  in  shape  parameters).  Clearly,  this  observation  is  valid  only  for  the 
observables,  distance  metric  and  aspect  angle/aircraft  mismatch  ranges  examined  here. 

Another problem was that the conventional warping algorithm used in this test allowed
no warp flexibility at the endpoints of the sequences, which were constrained to
match. For one case in which an F-4 model-derived sequence was matched against a
slightly offset sequence, a very high matching cost was obtained. This was almost certainly
due to the fact that the endpoints of one sequence were the high amplitude returns
from the sides of the F-4, while the endpoints of the other sequence were radically different,
and high costs from this endpoint mismatch could not be avoided. Ultimately, then,
an “unrestricted endpoint” technique [182], as applied in speech recognition for words
with uncertain start- and endpoints, was applied for classical sequence comparison-type
techniques to overcome this problem. Larson and Peschon-type techniques, not forced to
match each element along an aspect angle path, inherently have “unrestricted endpoint”
qualities.
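
One way to relax the endpoint constraint is sketched below. It is a generic illustration and not the technique of [182]: the path may begin in any cell of the first row or column within `slack` elements of the corner, and may end within `slack` elements of the opposite corner. The `slack` parameter, the 0.5 transition weight, and the scalar test sequences are all assumptions; path-length normalization is omitted for brevity.

```python
def dtw_open_ends(seq_a, seq_b, dist, slack=1, straight_weight=0.5):
    """Dynamic time warping with relaxed start and end points."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(seq_a[i], seq_b[j])
            best = inf
            # A path may start here if the cell is near the (0, 0) corner.
            if (i == 0 and j <= slack) or (j == 0 and i <= slack):
                best = d
            if i > 0 and j > 0:
                best = min(best, cost[i - 1][j - 1] + d)
            if i > 0:
                best = min(best, cost[i - 1][j] + straight_weight * d)
            if j > 0:
                best = min(best, cost[i][j - 1] + straight_weight * d)
            cost[i][j] = best
    # A path may end in the last row or column near the far corner.
    ends = [cost[n - 1][j] for j in range(max(0, m - 1 - slack), m)]
    ends += [cost[i][m - 1] for i in range(max(0, n - 1 - slack), n)]
    return min(ends)

abs_diff = lambda a, b: abs(a - b)
with_wall = [50, 1, 2, 3, 4]      # first element mimics a high side-on return
shifted = [1, 2, 3, 4, 5]         # slightly offset sequence
constrained = dtw_open_ends(with_wall, shifted, abs_diff, slack=0)
relaxed = dtw_open_ends(with_wall, shifted, abs_diff, slack=1)
```

With `slack=0` the routine reduces to the endpoint-constrained case and the mismatched first element dominates the cost; with `slack=1` the high-amplitude endpoint is simply left unmatched.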

5.4.4 Conclusion. In sum, this short test gave early confidence that dynamic
programming  sequence  comparison  techniques  could  be  applied  to  high  range  resolution 
radar  signatures. 

5.5  Detailed  Procedures  for  Motion  Warping 

5.5.1 Introduction. This section presents a detailed discussion of procedures used
in this research to apply dynamic programming sequence comparison techniques in target
recognition. These algorithms are basic applications of “motion warping” as defined and
discussed in Chapters I and III - elements in the second of three classes of
kinematic-feature observable fusion algorithms for object recognition, as proposed in this research.
In this section, we will develop procedures more specifically, referring to previous sections
where needed. Results are presented in the following Sect. 5.6.

5.5.2 Tracking the Target. The first step in “motion warping” is to estimate the
target translational and rotational motion, or 6-DOF kinematic state. In this scenario for
aircraft targets, a conventional extended Kalman filter (see Sect. 2.3) and associated fixed
lag smoother (see Sect. 2.3.1.2) will estimate target position, velocity, and acceleration. The
filter (tracker) development for this chapter uses conventional “kinematic” measurements
of target position (range and pointing angle) and range rate (doppler velocity) only - we do
not assume availability of pose estimates or other information regarding target acceleration
state or aspect angle.

To  estimate  rotational  states  or  (equivalently  here)  aspect  angle  from  translational 
kinematics  for  aircraft  targets,  it  is  essential  to  determine  the  magnitude  and  direction 
of  the  target’s  normal  load  acceleration  (total  target  acceleration  normal  to  the  velocity 
vector,  less  gravity)  over  the  time  of  interest.  Use  of  the  smoother  is  therefore  critical, 
due  to  our  inability  to  observe  the  pilot’s  commanded  attitude  changes  in  near-real  time, 
as  Kendrick  et  al.  could  using  pose  estimates  (see  Sect.  2.3.3).  Smoother  equations  (and 
results)  are  given  in  App.  C.6. 

The  extended  Kalman  filter  employed  here  has  a  standard  nine-state  filter  model  - 
target  position,  velocity,  and  acceleration  along  each  of  three  orthogonal,  assumed  inertial 
axes  in  the  filter  frame.  The  target  acceleration  model  is  the  standard  Singer  model  as 
discussed in Sect. 2.3.2.1. Following the conventions for this model, the target acceleration
standard deviation along each axis is assumed to be 32 feet per second squared, and the
target maneuver correlation time is assumed to be 3 seconds. Consistent with the discussion
in Sect. 2.3.2.1, these parameters define the filter dynamics driving noise strength Q for
the  acceleration  states  -  no  pseudonoise  was  added  to  other  states. 
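
For reference, the sketch below evaluates the usual textbook form of the Singer model for one axis, in which acceleration is a first-order Gauss-Markov process with inverse correlation time alpha = 1/T and white driving noise of strength 2 * alpha * sigma^2. It is assumed, not confirmed, that this matches the form given in Sect. 2.3.2.1; the function names are hypothetical, and the 0.1 second step is the measurement interval quoted for this filter.

```python
import math

def singer_parameters(sigma_accel, correlation_time):
    """Return (alpha, q): inverse correlation time and the strength of the
    white noise driving the acceleration state (textbook Singer form)."""
    alpha = 1.0 / correlation_time
    q = 2.0 * alpha * sigma_accel ** 2
    return alpha, q

def singer_transition(alpha, dt):
    """State transition matrix for [position, velocity, acceleration]
    along one axis over a time step dt."""
    e = math.exp(-alpha * dt)
    return [
        [1.0, dt, (alpha * dt - 1.0 + e) / alpha ** 2],
        [0.0, 1.0, (1.0 - e) / alpha],
        [0.0, 0.0, e],
    ]

# 32 ft/s^2 standard deviation, 3 s correlation time, 0.1 s update interval.
alpha, q = singer_parameters(32.0, 3.0)
phi = singer_transition(alpha, 0.1)
```

For these values the driving noise strength is about 683 ft^2/s^5 per axis, and because the update interval is short compared with the correlation time, the transition matrix is close to that of a constant-acceleration model.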

The measurement model assumptions for target position and range rate are consistent
with those presented for the filter model in the previous chapter. Basically, (1)
measurements are assumed to be available every 0.1 seconds, (2) target position error is
modeled as white, Gaussian noise of standard deviation 100 feet, and (3) range rate error
is modeled as white, Gaussian noise of standard deviation 100 feet per second. Ownship
position and the direction and magnitude of gravity are assumed to be known perfectly -
the amount of error in target aspect angle estimates from these assumptions is negligible
compared to error contributions from target position and velocity measurement noise.

Ultimately,  it  was  found  that  even  acceleration  estimates  from  the  smoother  were  too 
“noisy”  to  provide  the  desired  accuracy  and  smoothness  in  estimates  of  target  acceleration. 
The  smoother  provided  an  excellent  estimate  of  target  position,  however  -  much  improved 
over  the  raw  filter  position  estimate,  and  robust  to  mismatches  in  target  model  parameters. 
For  that  reason,  target  acceleration  was  estimated  by  fitting  target  position  in  each  inertial 
dimension  to  a  second  degree  polynomial  curve,  and  differentiating  the  curve  parameters 
twice  to  derive  acceleration. 
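
A minimal sketch of this step for one inertial axis follows. It fits p(t) = c0 + c1 t + c2 t^2 by ordinary least squares and takes the second derivative, 2 c2, as the acceleration estimate. The two-second window, the sample values, and the function names are assumptions for illustration; the window length actually used is not stated here.

```python
def fit_quadratic(times, positions):
    """Least-squares coefficients [c0, c1, c2] of p(t) = c0 + c1*t + c2*t**2."""
    s = [sum(t ** k for t in times) for k in range(5)]
    b = [sum(p * t ** k for t, p in zip(times, positions)) for k in range(3)]
    # Augmented normal-equation matrix, solved by Gaussian elimination.
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (a[i][3] - tail) / a[i][i]
    return coef

def acceleration_from_fit(times, positions):
    """Second derivative of the fitted quadratic (constant over the window)."""
    return 2.0 * fit_quadratic(times, positions)[2]

# Smoothed positions (feet) every 0.1 s for a constant 32 ft/s^2 acceleration.
times = [0.1 * k for k in range(21)]
positions = [100.0 + 50.0 * t + 16.0 * t * t for t in times]
accel = acceleration_from_fit(times, positions)
```

Applying the same fit independently in each of the three inertial dimensions yields the acceleration vector used in the aspect angle estimate.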

5.5.3  Developing  the  Kinematic  Aspect  Angle  Path  and  Error  Bounds.  The 
kinematic  aspect  angle  path  is  important  for  two  reasons.  First,  for  the  true  target,  or  truth 
model,  this  path  defines  the  aspect  angle  locations  that  generate  the  observed  or  measured 
signatures.  Second,  for  candidate  target  models  in  a  target  recognition  algorithm,  this  path 
defines  a  nominal  set  of  points  around  which  associations  will  be  made  between  measured 
and  model  signatures,  out  to  some  aspect  angle  limit  defined  by  an  appropriate  set  of 
bounds.  A  typical  path  and  bounds  were  shown  in  Fig.  2.9.a. 

Both true and kinematically-estimated aspect angle paths are defined by tracing the
path over time of the target-to-sensor vector, denoted $\mathbf{r}^{b}_{t-s}$, coordinatized in body frame
coordinates (as shown by the superscript $b$), with time derivatives observed with respect to
the target body frame or hypothetical aspect angle sphere (for which derivatives a second
superscript will be added to $\mathbf{r}^{b}_{t-s}$ when required). For the truth or target model, this path
can be obtained directly - for the motion warping algorithm, it must be estimated based
on target kinematics and known dynamic restrictions of candidate target classes.

5.5.3.1 Developing the Kinematic Aspect Angle Path - Aircraft Targets.
The effort described in this section has two basic objectives. First, we seek to define a nominal
or kinematically-estimated aspect angle path and path angular rate, for a given target
model using some set of kinematic measurements. Second, using information developed
for the first objective, the latter part of this section will establish estimated covariances for
the nominal angles and their rates. These covariances will be used to define aspect angle
windows, motion warping path constraints, and aspect angle state transition likelihoods.

See Figs. 5.23 and 5.24 for vector and angle definitions in the following discussion.
Recall that the velocity frame $x_v$ vector is defined by the direction of the flight path, that
the $y_v$ vector is defined to lie normal to $x_v$ in a plane parallel to the local horizontal (the
$n$-$e$ plane), and that $z_v$ is defined pointing generally downward to form a right-handed
set. Note also that the perspective in Fig. 5.24 is somewhat misleading for a conventional
aircraft, due to the author’s artistic limitations - actually, showing a positive lift in the
velocity frame $y_v$-$z_v$ plane, as the drawing implies, should tend to show more of the
ventral or bottom surface of the aircraft, due to the required angle of attack $\alpha$.

Using the acceleration estimate derived from the polynomial curve fit to the smoothed
trajectory, the baseline (coordinated turn) system will estimate aspect angle for each
potential target class, using essentially the approach prescribed for the Kendrick estimator.
That is, acceleration normal to the velocity vector is assumed due to wing-generated lift and
gravity only, with no component from thrust, aerodynamic side-forces, or other sources.
The relationship between lift and angle of attack is given by:

$L = \frac{1}{2} \rho_{atm} V^2 S C_{L\alpha} \alpha$  (5.1)

where:

$L$ = lift force magnitude (aircraft mass assumed known)
$\rho_{atm}$ = atmospheric density
$V^2$ = velocity magnitude squared
$S$ = aerodynamic surface area
$C_{L\alpha}$ = coefficient of lift
$\alpha$ = angle of attack

Thus, for these conventionally-controlled aircraft classes, normal load magnitude will
determine angle of attack for a given aircraft class and velocity, and normal load direction
will determine roll angle. Sideslip velocity (component of velocity normal to the body
frame $x_b$-$z_b$ plane) will be assumed zero. Consistent with the modelling simplifications in
the Kendrick and Andrisani efforts [120, 5] and other analyses [104, 143], our atmosphere
will be considered at rest with respect to the inertial frame, with on-line corrections to be
implemented if target-local wind velocity is available. Note also that, by arbitrary choice
of the direction of $x_b$, we have defined the zero-lift angle of attack ($\alpha_0$, as in Chapter IV)
to equal zero. Unavoidable real world deviations from these "zero-nominal" assumptions
will induce some error into our estimates of aspect angle from kinematics, but we will size
our search areas in aspect angle adequately to deal with expected deviations.

Figure 5.24. Accelerations for Coordinated Turns
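
The coordinated-turn reasoning above can be illustrated with a minimal Python sketch. It is not the code used in this research; the function name `alpha_and_roll`, the north-east-down inertial frame, the constant `G`, and the convention that roll is measured from the "lift straight up" direction are all assumptions made for illustration. The sketch takes the part of (acceleration minus gravity) normal to the velocity vector as the lift per unit mass, inverts Eqn. (5.1) for angle of attack, and reads the roll angle from the lift direction in the velocity frame.

```python
import math

G = 9.81  # m/s^2; assumed constant gravity along +z (down) of a north-east-down frame


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return [x / n for x in a]


def alpha_and_roll(vel, acc, mass, rho_atm, area, cl_alpha):
    """Return (angle of attack, roll angle) in radians for a coordinated turn.

    vel, acc: inertial (north, east, down) velocity and acceleration estimates.
    Lift is assumed to supply all acceleration normal to the velocity vector
    that gravity does not (no thrust or side-force component).
    """
    speed = math.sqrt(_dot(vel, vel))
    xv = _unit(vel)                          # velocity frame x axis
    yv = _unit(_cross([0.0, 0.0, 1.0], xv))  # horizontal; undefined for vertical flight
    zv = _cross(xv, yv)                      # generally downward
    f = [acc[0], acc[1], acc[2] - G]         # acceleration that lift must supply
    along = _dot(f, xv)
    f_n = [f[i] - along * xv[i] for i in range(3)]  # part normal to velocity
    lift = mass * math.sqrt(_dot(f_n, f_n))
    alpha = lift / (0.5 * rho_atm * speed ** 2 * area * cl_alpha)  # Eqn. (5.1) inverted
    roll = math.atan2(_dot(f_n, yv), -_dot(f_n, zv))  # zero when lift points straight up
    return alpha, roll
```

For example, a level turn with horizontal acceleration `G * tan(60 deg)` gives a 60 degree roll angle and twice the angle of attack of straight and level flight at the same speed, as expected for a two-g turn.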

For "control configured" classes of aircraft, unconventional assumptions will be made
to provide aspect angle as a function of kinematics - for example, an aircraft with
"turn-like-a-car" dynamics might be assumed to have only a nominal angle of attack, generally
keeping the target body vector aligned with the velocity vector, but rolling in the direction
of a turn for pilot comfort and visibility. In our research to date, we have assumed only
a conventional aircraft's coordinated turn motion - note that for any control method in
which the plane of the wings is essentially normal to the lift vector, minor deviations from
the coordinated turn dynamics result only in an aspect angle position bias error which is
ignored by our algorithms. Recall however, as discussed in Sect. 3.6.5, that inherently our
recognition algorithms are multiple model estimators - different possibilities or assumptions
as to dynamics add a second dimension to the range of hypotheses for which we must
compare models to observations, based on discrete choices for the dynamics assumptions.
The fundamental dimension in this range of hypotheses is of course described by the
different target signatures as functions of aspect angle.

Given some assumption on target body frame orientation relative to the target
velocity vector, we need to define the orientation of the body frame relative to the inertial
frame. This orientation is provided by the following computation involving direction cosine
matrices, which yields the body frame unit vectors in inertial frame coordinates:

$F_b = C_v^b C_i^v = C_i^b$  (5.2)

where:

$F_b$ = a three by three matrix, each row of which defines a target body frame unit
vector in inertial frame coordinates, and

$C_v^b = \begin{bmatrix} \cos(\alpha) & \sin(\rho)\sin(\alpha) & -\cos(\rho)\sin(\alpha) \\ 0 & \cos(\rho) & \sin(\rho) \\ \sin(\alpha) & -\sin(\rho)\cos(\alpha) & \cos(\rho)\cos(\alpha) \end{bmatrix}$  (5.3)

$C_i^v = \begin{bmatrix} \cos(\eta)\cos(\gamma) & \sin(\eta)\cos(\gamma) & -\sin(\gamma) \\ -\sin(\eta) & \cos(\eta) & 0 \\ \cos(\eta)\sin(\gamma) & \sin(\eta)\sin(\gamma) & \cos(\gamma) \end{bmatrix}$  (5.4)

and all angles are defined in Figs. 5.23 and 5.24.
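
As a hedged illustration of Eqns. (5.2) through (5.4), the following Python sketch builds the two direction cosine matrices and multiplies them. The function names are hypothetical, angles are in radians, and the matrix entries assume a roll-then-angle-of-attack rotation from the velocity frame to the body frame and a heading-then-flight-path rotation from the inertial frame to the velocity frame.

```python
import math


def c_vel_to_body(roll, alpha):
    """Velocity frame to body frame: roll about x_v, then angle of attack about y."""
    sr, cr = math.sin(roll), math.cos(roll)
    sa, ca = math.sin(alpha), math.cos(alpha)
    return [[ca, sr * sa, -cr * sa],
            [0.0, cr, sr],
            [sa, -sr * ca, cr * ca]]


def c_inertial_to_vel(heading, gamma):
    """Inertial frame to velocity frame: heading about z, then flight path angle about y."""
    sh, ch = math.sin(heading), math.cos(heading)
    sg, cg = math.sin(gamma), math.cos(gamma)
    return [[ch * cg, sh * cg, -sg],
            [-sh, ch, 0.0],
            [ch * sg, sh * sg, cg]]


def matmul3(a, b):
    """Plain 3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def body_axes_in_inertial(roll, alpha, heading, gamma):
    """Rows are the body frame unit vectors in inertial coordinates (Eqn. 5.2)."""
    return matmul3(c_vel_to_body(roll, alpha), c_inertial_to_vel(heading, gamma))
```

With all angles zero the result is the identity; with a heading of 90 degrees and the other angles zero, the first row (the body x axis) points along the inertial east axis.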

Thus, the velocity frame is found relative to the inertial frame by use of the Euler
angles identified as heading ($\eta$) and flight path ($\gamma$) angles. The body frame is found
relative to the velocity frame by considering (for coordinated turns with zero sideslip
angle) the calculated roll angle ($\rho$) and angle of attack ($\alpha$) required as discussed above to
generate the required normal acceleration. Other appropriate assumptions are made for
unconventionally controlled vehicles. Again, for the truth model or simulated target, these
quantities are known - for the motion warping processor, they are estimated.

Given $C_i^b$, the target-to-sensor unit vector is defined in target body frame coordinates
as

$\hat{r}^{b}_{t-s} = C_i^b \, \frac{r^{i}_{t-s}}{|r^{i}_{t-s}|}$  (5.5)

where we have simply divided the target-to-sensor vector in inertial coordinates by its
magnitude and performed the coordinate transform.

With the target-to-sensor unit vector defined in target body frame coordinates, we
now calculate the target aspect angle (azimuth $\kappa$ and elevation $\lambda$ of the target-to-sensor
vector, see Fig. 5.23) using the following relations (where the superscript $bx$, for example,
means the component of the associated target-sensor unit vector in the target body frame
$x_b$ direction):

$\kappa = \tan^{-1}\left( \hat{r}^{by}_{t-s} / \hat{r}^{bx}_{t-s} \right)$  (5.6)

for positive $\hat{r}^{bx}_{t-s}$ and $\hat{r}^{by}_{t-s}$, or

$\kappa = \tan^{-1}\left( \hat{r}^{by}_{t-s} / \hat{r}^{bx}_{t-s} \right) + 360°$  (5.7)

for negative $\hat{r}^{by}_{t-s}$ and positive $\hat{r}^{bx}_{t-s}$, or

$\kappa = \tan^{-1}\left( \hat{r}^{by}_{t-s} / \hat{r}^{bx}_{t-s} \right) + 180°$  (5.8)

for positive or negative $\hat{r}^{by}_{t-s}$ and negative $\hat{r}^{bx}_{t-s}$, and

$\lambda = \sin^{-1}\left( -\hat{r}^{bz}_{t-s} \right)$  (5.9)

for positive $\hat{r}^{bz}_{t-s}$, or

$\lambda = \sin^{-1}\left( |\hat{r}^{bz}_{t-s}| \right)$  (5.10)

for negative $\hat{r}^{bz}_{t-s}$ (note that due to SRC model definitions, positive elevation angles are
toward the negative $z_b$ direction, as shown in Fig. 5.23).
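
The quadrant-by-quadrant azimuth rules and the sign convention for elevation can be collapsed into a few lines of Python. This is only a sketch: the function name `aspect_angles` is hypothetical, and it assumes that azimuth is measured from the body $x_b$ axis toward $y_b$ over 0 to 360 degrees, with positive elevation toward negative $z_b$ as noted above. The two-argument arctangent resolves the quadrant in one call.

```python
import math


def aspect_angles(r_body):
    """Return (azimuth, elevation) in degrees for a target-to-sensor unit vector
    given in target body frame coordinates (x_b, y_b, z_b)."""
    x, y, z = r_body
    # atan2 resolves the quadrant; the modulo maps (-180, 180] onto [0, 360)
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    # positive elevation is toward negative z_b; clamp guards against rounding
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, -z))))
    return azimuth, elevation
```

A sensor directly off the nose, `aspect_angles([1.0, 0.0, 0.0])`, gives zero azimuth and zero elevation, while a sensor along the negative $y_b$ axis gives an azimuth of 270 degrees.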

Using the time derivatives of the matrix $C_i^b$ and the target-to-sensor vector in inertial
frame coordinates, we can define the time derivative of the target-to-sensor unit vector with
respect to the body frame, in target frame angular coordinates. This quantity will define
the (nominal) angular rate along the kinematically-computed aspect angle path. The rate
of change of the target-to-sensor vector, with respect to the body frame, defined in target
body frame coordinates, is given by:

$\frac{d r^{bb}_{t-s}}{dt} = C_i^b \, \frac{d r^{ii}_{t-s}}{dt} + \dot{C}_i^b \, r^{i}_{t-s}$  (5.11)

where:

$\dot{C}_i^b$ = the time derivative of $C_i^b$, element by element

and the first superscript on derivatives of $r$ refers to coordinatization, while the second
refers to the frame with respect to which a derivative is taken.

Dividing the above rate by the length of $r^{b}_{t-s}$ and taking the component of this
normalized rate perpendicular to the unit vector yields the traverse rate, in radians per
second, of the aspect angle vector along an instantaneous "great circle" path (of generally
constantly changing direction) on the hypothetical aspect angle sphere of unit radius.
This traverse rate is now the estimated mean aspect angle rate. If desired, this rate can
be isolated into components in azimuth and elevation directions, as defined in Fig. 5.23.
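
The rate computation described above can be sketched in Python as follows. The function names are hypothetical, matrices are plain nested lists, and the derivative matrix is assumed to be supplied by the caller (for example from finite differences of successive direction cosine matrices). `rate_in_body` follows the form of Eqn. (5.11), and `traverse_rate` normalizes by range and keeps only the component perpendicular to the line of sight.

```python
import math


def rate_in_body(c_ib, c_ib_dot, r_i, r_i_dot):
    """Rate of change of the target-to-sensor vector in body coordinates:
    C * (dr/dt) + (dC/dt) * r, with C the inertial-to-body matrix."""
    return [sum(c_ib[i][j] * r_i_dot[j] + c_ib_dot[i][j] * r_i[j] for j in range(3))
            for i in range(3)]


def traverse_rate(r_b, r_b_dot):
    """Angular rate (rad/s) of the line of sight on the unit aspect angle sphere."""
    rng = math.sqrt(sum(x * x for x in r_b))
    unit = [x / rng for x in r_b]
    scaled = [x / rng for x in r_b_dot]               # rate divided by vector length
    radial = sum(a * b for a, b in zip(scaled, unit))  # component along the line of sight
    perp = [scaled[i] - radial * unit[i] for i in range(3)]
    return math.sqrt(sum(x * x for x in perp))
```

For a sensor 100 units away along $x_b$ with a body-frame rate of (5, 10, 0) units per second, the radial part (5) changes only the range, and the traverse rate is 10/100 = 0.1 radians per second.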

Thus, by the previous sequence of computations, we have estimated the (kinematically
observable) angular position and rate of $\hat{r}^{b}_{t-s}$ (as observed from the target body
frame). Recapping, these aspect angle changes are assumed primarily due to (1) pitch
control (changing angle of attack) to change the magnitude of normal acceleration, or (2)
roll control to change the direction of normal acceleration, or (3) (less rapid) target motion
along curved portions of the trajectory (no change in acceleration relative to the target
body frame), or (4) motion of the sensor along the ownship trajectory (the latter assumed
known).

Clearly, in the absence of normal acceleration followed in turn by velocity and position
changes, target aspect angle changes are unobservable, in the estimation theory sense, to
our kinematic estimator. The absence of normal acceleration implies that the aircraft is
flying  an  essentially  straight  (not  necessarily  level)  trajectory,  perhaps  with  a  non-zero  roll 
rate.  In  any  case,  this  class  of  trajectory  is  of  minor  interest  to  us  for  two  reasons.  First,  a 
combatant  (as  opposed  to  commercial  or  military  transport)  aircraft  target  seldom  flies  in 
a  straight  line  for  long  periods,  unless  the  pilot  is  unaware  of  being  tracked  (it  is  generally 
assumed  that  an  adversary’s  warning  systems  will  alert  him  to  the  presence  of  active 
sensors).  Second,  the  fact  that  the  target  is  flying  a  straight  trajectory  implies  (unless  the 
target  is  rolling)  that  the  target-sensor  aspect  angle  is  not  changing  significantly  -  thus, 
the  high  information  content  characteristic  sequences  that  we  seek  will  probably  not  be 
available  anyway. 

However, should we desire to estimate the target's orientation in straight flight,
reasonable  assumptions  can  be  made.  In  most  cases,  the  aircraft  will  be  flying  with  “wings 
level”,  i.e.,  the  plane  containing  the  body  frame  xb  and  zb  unit  vectors  will  be  perpendicular 
to  the  local  horizontal  plane,  and  the  body  frame  pitch  angle  can  be  estimated  from  the 
flight  path  angle  and  the  angle  of  attack  required  to  counteract  the  normal  component 
of  gravity.  In  Sect.  6.2,  techniques  derived  from  word  spotting  in  continuous  speech  with 
dynamic  time  warping  will  be  proposed  to  identify  target  rolling  motion  and  class  in  the 
absence  of  normal  acceleration. 

5.5.3.2 Bounds Around the Kinematic Aspect Path. The preceding, or first
part of this discussion has addressed only the determination of the nominal
kinematically-estimated aspect angle path. The ranges or bounds of probable errors in the start and end
points of this path, in the angular rates at which the path is traversed, and "off-nominal"
errors can be estimated from the statistics of the errors in the kinematic estimates for
velocity and acceleration.

The  aspect  angle  position  and  rate  error  bounds  will  in  turn  define  the  continuity 
constraints  for  dynamic  programming  sequence  comparison  methods.  With  reference  to 
the  “gridded”  aspect  angle  region  in  Fig.  2.9.a,  these  bounds  will  define  the  width  of 
this  region,  and  the  length  of  this  region,  or  “window”  to  be  checked  for  any  one  target 
signature. 

The  problem  here  is  that  the  relationships  between  kinematic  state  estimate  errors 
and  aspect  angle  estimate  errors  are  highly  nonlinear  in  general.  For  this  research,  adequate 
linear  approximations  were  made  by  defining  distributions  of  starting  values  about  the 
nominal  positions  and  rates  according  to  the  equation: 

$P_A = E P E^T$  (5.12)

where:

$P_A$ = a first order (linearized) covariance estimate, at any nominal aspect angle
along the trajectory, for the error in the angular position and angular rate of the nominal
aspect path, in the direction of and normal (cross-track) to the path (a 4x4 matrix).

$E$ = a 4 (row) x 6 (column) dimensional Jacobian matrix of partial derivatives,
defined by determining the partial derivatives of angular position and rate along and
normal (cross-track) to the nominal path (four separate quantities), with respect to the target
velocity and acceleration components along each inertial frame axis (a total of six
components).

$P$ = the (estimated, from an extended Kalman filter and associated smoother)
covariance of the target velocity and normal acceleration estimates.
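
A minimal Python sketch of the first order propagation in Eqn. (5.12) follows. It assumes, purely for illustration, that the Jacobian is obtained by central finite differences of a user-supplied function mapping the kinematic estimates to the angular quantities; the source does not state how the partial derivatives were evaluated, and the function names here are hypothetical.

```python
def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian of func (list -> list) evaluated at x."""
    base = func(x)
    jac = [[0.0] * len(x) for _ in base]
    for j in range(len(x)):
        hi, lo = list(x), list(x)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = func(hi), func(lo)
        for i in range(len(base)):
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2.0 * eps)
    return jac


def propagate_covariance(jac, cov):
    """Return jac * cov * jac^T, the linearized output covariance."""
    rows, n = len(jac), len(cov)
    ep = [[sum(jac[i][k] * cov[k][j] for k in range(n)) for j in range(n)]
          for i in range(rows)]
    return [[sum(ep[i][k] * jac[j][k] for k in range(n)) for j in range(rows)]
            for i in range(rows)]
```

In the setting of this section, `func` would map the six velocity and acceleration components to the four along-track and cross-track position and rate quantities, so that `jac` is 4x6 and the result is the 4x4 matrix $P_A$.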

The along-track and cross-track angular error bounds at each measurement
observation time allow us to subdivide the total aspect angle region shown in Fig. 2.9 into allowable
aspect angle regions, or windows, at each time. These windows or subregions will in general
overlap, as shown in Fig. 3.2 in Chapter III.

The along-track and cross-track angular rate statistics given by $P_A$, then, allow us
to estimate quantitatively the likelihood of transitions between any two aspect angle states
on adjacent windows over a given measurement time interval $\Delta T = t_{k+1} - t_k$. Since we
have an estimate for the mean aspect angle rate as found in the previous section, we can
estimate the natural logarithm $L_T$ of the Gaussian classical likelihood of the rate implied
by such a transition (less the usual constant term). This in turn can be taken to provide
the natural logarithm of the a priori aspect angle transition likelihood $p(x^a_{k+1} \mid x^a_k, Z^k_m, \omega_i)$,
as discussed in Sect. 3.6 with respect to the Larson and Peschon equations:

$L_T = [\ln p(x^a_{k+1} \mid x^a_k, Z^k_m, \omega_i)] - C$

where:

$L_T$ = the log Gaussian classical likelihood (less the usual constant term $C$) of the
transition rate implied by movement in aspect angle between two given aspect angle cells
$x^a_k$ and $x^a_{k+1}$ over one measurement interval $\Delta T$, for some target class $\omega_i$, given $m$ kinematic
measurements $Z^k_m$,

$P_{A22}$ = a 2x2 matrix, the lower right sub-matrix of $P_A$

$\Delta x^a_{AT}$ = the along-track component of the difference in angle between the two given
aspect angle cells $x^a_k$ and $x^a_{k+1}$ (i.e., the angle component along the direction of the mean
expected aspect angle rate)

$\Delta x^a_{CT}$ = the cross-track component of the difference in angle between the two given
aspect angle cells $x^a_k$ and $x^a_{k+1}$

$\bar{r}_{AT}$ = the mean expected aspect angle rate in the along-track direction, which is
found as discussed in the previous section (see Eqn. (5.11) and associated comments)

$\bar{r}_{CT}$ = the mean expected aspect angle rate in a cross-track direction, which is equal
to zero by definition

$C$ = the usual factor associated with the natural logarithm of the leading term of a
Gaussian probability density, a constant for constant $P_{A22}$

and other terms have been defined.
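
The symbol definitions above suggest the usual Gaussian quadratic form for the log transition likelihood. The following Python sketch assumes that form, namely $-\frac{1}{2} d^T S^{-1} d$, where $d$ is the difference between the implied rate (angle difference divided by the measurement interval) and the mean rate, and $S$ is the 2x2 rate covariance. The function and parameter names are hypothetical and the exact expression used in this research is not reproduced here.

```python
def log_transition_likelihood(dx_at, dx_ct, dt, mean_rate_at, rate_cov):
    """Log Gaussian likelihood (less the constant term) of an aspect angle transition.

    dx_at, dx_ct : along-track and cross-track angle differences between two cells
    dt           : measurement interval
    mean_rate_at : mean expected along-track rate (cross-track mean taken as zero)
    rate_cov     : 2x2 covariance of along-track and cross-track angular rates
    """
    dev = [dx_at / dt - mean_rate_at, dx_ct / dt - 0.0]
    (a, b), (c, d) = rate_cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # closed-form 2x2 inverse
    quad = sum(dev[i] * inv[i][j] * dev[j] for i in range(2) for j in range(2))
    return -0.5 * quad
```

A transition whose implied rate equals the mean rate scores zero (the maximum), and a transition one standard deviation off in the along-track direction scores -0.5.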

The natural logarithm of the likelihood for the a priori starting state $p(x^a_0 \mid Z^k_m, \omega_i)$ is
defined in an analogous fashion, using the upper left 2x2 submatrix of $P_A$, corresponding
to covariance of angular position about the mean or nominal aspect angle estimate from
kinematics at time $t_0$ (strictly following the Larson and Peschon methodology, this is one
time interval prior to the first signature measurement).

It is important to remember that the accuracy of these estimates for angular position
and rates and associated transition likelihoods is completely dependent upon the extent
to which the kinematic state estimator model matches the actual target behavior. Recall
that in the effort discussed in this chapter, we assume that the target acceleration relative
to the body frame is constant - in other words, a constant turn rate model (not necessarily
a planar turn). We depend here upon the analysis of kinematic observations (by extended
Kalman filter / smoother / curve fit) to ensure that the target is in fact executing a
constant acceleration turn as HRR radar signatures are taken. In Chapter VI, we will
combine ideas from Sect. 3.4 and Chapter IV to free us from dependence on the smoother
and constant acceleration turn model, as previewed in Sect. 3.9.

Straightforward extensions of Eqn. (5.12) allow for calculation of angular position
and rate error "covariances" due to filter state errors and errors in assumption of other
variables such as coefficient of lift $C_{L\alpha}$ (normally assumed at some nominal), sideslip angle
(normally assumed zero), as well as off-nominal path errors and error rates. Note that this
technique can also be used to estimate a (pseudo-) measurement error "covariance" for
the aspect angle pseudo-measurement provided to the aspect angle Kalman filter in the
Kendrick estimator (this value was a fixed input in the Kendrick research).

These careful computations of aspect angle position and rate error may not seem
worth the effort, but they yield a great deal of interesting information. They show, for
example, that the usual assumption of "square" [164] or "circular" [20] aspect angle windows
for  searches  on  aircraft  target  models  is  probably  suboptimal  when  target  kinematics  are 
reasonably  well  known  -  generally,  the  window  extent  should  be  greater  in  the  direction 
of  possible  angle  errors  due  to  target  roll  than  it  is  in  the  direction  of  angle  errors  due  to 
changes  in  pitch  or  angle  of  attack.  Since  computational  loading  is  driven  by  window  size 
(and  the  discretization  fineness  or  granularity  of  aspect  angle  cells  in  the  window)  optimal 
sizing  of  aspect  angle  windows  could  be  a  significant  issue. 

For most of the trajectories in this research, the "along-track" direction corresponds
to changes in pitch, and the "cross-track" direction corresponds to changes in roll.
Accordingly, a typical aspect angle window size in this research was 10 degrees in the along-track
direction by 20 degrees in the cross-track direction. For a two-g turn tracking scenario like
that for Fig. 5.27 and others to be discussed in Sect. 5.6, this corresponds to approximately
+/- two standard deviations of angular position error for both along-track and cross-track
aspect angle position, as estimated from kinematics, or correspondingly more standard
deviations of error for higher acceleration turns (these angular standard deviations are the
square roots of the first two diagonal elements of the matrix $P_A$ above). Simply, the harder
the aircraft turns, the more one can be sure of its aspect angle.

5.5.3.3 Ground Targets. Derivation of kinematically-estimated aspect angle
and aspect angle rate for ground vehicles is quite analogous to the above, except that
the coordinated turn assumption (or other assumption as appropriate) is replaced by
assumptions that the vehicle is generally pointed in the direction of its velocity vector, and
that turns occur on a flat surface, or that the surface at the vehicle location is reasonably
well known (as from stored map data).

5.5.4  Extracting  the  Kinematically-Estimated  Feature  Observable  Sequence.  As 
shown  in  Fig.  3.2,  we  have  used  kinematic  measurements  to  define,  for  each  potential  target 
class,  a  set  of  aspect  angles  or  “map”  over  which  we  will  make  comparisons  with  observed 
signature  measurements.  The  next  step  is  to  load  or  associate  the  map  “cells”  with  library 
signatures  from  the  database  corresponding  to  the  desired  target  model. 

There are any number of more or less efficient ways to approach this software
engineering issue. In this research, the map is simply a Fortran array which can be loaded
with  aspect  angle  values,  and  the  library  signature  value  closest  to  each  map  aspect  angle 
is  found  by  testing  each  library  signature’s  aspect  angle  against  each  map  value  as  the 
program  reads  the  entire  library  signature  file.  If  a  library  signature  within  one  degree  in 
aspect  and  ten  degrees  of  polarization  is  not  found  for  each  map  cell  during  the  library 
read  operation,  an  error  advisory  prints  to  output. 
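
The single-pass loading scheme described above can be sketched in Python (the research code itself was Fortran and is not reproduced here). The record layout, the function name `load_map`, and the use of the larger of the azimuth and elevation differences as the aspect distance are assumptions made for illustration; the one degree aspect and ten degree polarization tolerances come from the text.

```python
def load_map(map_cells, library, max_aspect_deg=1.0, max_pol_deg=10.0):
    """Associate each map cell with the closest library signature in one pass.

    map_cells : list of (azimuth, elevation, polarization) in degrees
    library   : iterable of (azimuth, elevation, polarization, signature)
    Returns (signatures, missing): one signature (or None) per map cell, and the
    indices of cells for which no library entry fell within tolerance.
    """
    best = [None] * len(map_cells)
    best_dist = [float("inf")] * len(map_cells)
    for az, el, pol, sig in library:  # one pass, as when reading the library file
        for idx, (m_az, m_el, m_pol) in enumerate(map_cells):
            d_az = abs((az - m_az + 180.0) % 360.0 - 180.0)  # wrap across 0/360
            d_aspect = max(d_az, abs(el - m_el))
            if d_aspect > max_aspect_deg or abs(pol - m_pol) > max_pol_deg:
                continue
            if d_aspect < best_dist[idx]:
                best_dist[idx] = d_aspect
                best[idx] = sig
    missing = [i for i, sig in enumerate(best) if sig is None]
    return best, missing
```

The `missing` list plays the role of the error advisory: any index it contains marks a map cell with no sufficiently close library signature.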

For  actual  on-line  operation,  a  much  more  efficient  method  would  be  to  “point”  or 
reference  the  map  location  to  the  desired  library  element.  That  was  not  done  for  this 
research  to  avoid  keeping  all  of  the  libraries  loaded  in  storage  at  all  times.  This  and 
subsequent  processes  can  be  performed  identically  for  ground  and  air  targets. 

5.5.5  Comparing  the  Measured  Sequence  to  the  Candidate  Target  Signatures.  At 
this  point  we  have  a  sequence  of  actual  signature  observations  from  some  “true”  target, 
and  an  aspect  angle  “map”  loaded  with  appropriate  signatures  for  some  candidate  target 
class.  The  map  is  subdivided  into  aspect  angle  windows,  or  search  bounds  corresponding  to 
each  signature  measurement  time.  We  wish  to  compare  the  observed  signatures  to  the  map 
signatures  and  quantify  the  maximum  joint  likelihood  (or  a  comparable  measure)  that  the 
observed signatures came from the candidate target class. A number of algorithms have
been  investigated  for  this  task. 

Those  algorithms  to  follow  which  employ  dynamic  programming  are  implemented  in 
simple  code  structures  of  the  following  form  (see  [176:App.  B]).  Note  that  computations 
for  the  many  aspect  angle  cell  /  target  model  combinations  can  and  should  be  done  in 
parallel  for  fast  on-line  execution,  but  that  was  not  implemented  in  our  simulations. 

(1)  (Loop  1)  For  each  window  (equivalently,  the  current  signature): 

(2)  (Loop  2)  For  each  aspect  angle  cell  in  the  current  window  (the  current  cell): 

(3)  Compute  likelihood  or  cost  of  generating  the  current  signature  from  this  cell. 

(4)  Compute  allowable  predecessor  cells  (none  for  first  window). 

(5)  (Loop  3)  For  each  allowable  predecessor  cell: 

(6)  Compute  joint  sequence  likelihood  to  current  cell,  or  other  measure  of  path  cost. 

(7)  End  Loop  3. 

(8)  Select  allowable  predecessor  cell  giving  maximum  joint  likelihood  or  minimum  total 
cost  to  current  cell,  and  store  this  value,  predecessor  cell  location  (pointer)  and  other  data 
as  required,  referenced  to  the  current  cell  location. 

(9)  End  Loops  1  and  2. 

(10) Select maximum joint likelihood / minimum total cost cell in final or latest window
and retrace pointers to find best path.
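
The ten steps above can be sketched as a short Python function. This is an illustrative skeleton rather than the code of [176:App. B]: the function name `best_path` and the three callables (`emit_loglik`, `predecessors`, `trans_loglik`) are hypothetical interfaces, log likelihoods are maximized rather than costs minimized, and no parallel execution is attempted.

```python
def best_path(windows, emit_loglik, predecessors, trans_loglik=None):
    """Dynamic programming search over aspect angle windows.

    windows      : list of lists of hashable cells, one list per signature
    emit_loglik  : f(k, cell) -> log likelihood of signature k from this cell
    predecessors : f(k, cell) -> candidate cells in window k-1
    trans_loglik : optional f(prev, cell) -> log transition likelihood
    Returns (best joint log likelihood, best cell sequence).
    """
    score = [dict() for _ in windows]
    back = [dict() for _ in windows]
    for k, window in enumerate(windows):                  # Loop 1
        for cell in window:                               # Loop 2
            local = emit_loglik(k, cell)                  # step 3
            if k == 0:                                    # no predecessors in first window
                score[k][cell], back[k][cell] = local, None
                continue
            best_prev, best_val = None, float("-inf")
            for prev in predecessors(k, cell):            # Loop 3
                if prev not in score[k - 1]:
                    continue
                val = score[k - 1][prev]
                if trans_loglik is not None:
                    val += trans_loglik(prev, cell)
                if val > best_val:
                    best_prev, best_val = prev, val
            if best_prev is not None:                     # step 8: store value and pointer
                score[k][cell] = best_val + local
                back[k][cell] = best_prev
    last = len(windows) - 1
    end = max(score[last], key=score[last].get)           # step 10
    path = [end]
    for k in range(last, 0, -1):                          # retrace pointers
        path.append(back[k][path[-1]])
    path.reverse()
    return score[last][end], path
```

Leaving `trans_loglik` as `None` gives a bounded search with no transition weighting; supplying it adds log transition likelihoods along the path.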

We now discuss the individual algorithms compared in this research. The discussion
to follow assumes the use of classical likelihood functions or related quantities (e.g.,
Mahalanobis metrics) for observed signature-to-map signature comparisons.

5.5.5.1 The Independent Look (IL) Algorithm. The IL Algorithm is
a conventional "independent look" decision theoretic target recognizer as discussed in
Sect. 2.2. No restriction is placed on the resulting "pose estimate" sequence - we find
$\max \ln[p(z^f_k \mid x^a_{k,n}, \omega_i)]$ within the specified aspect angle window at each measurement
time $t_k$. These values are summed incrementally at each event to find the natural logarithm
of the maximum joint likelihood. With reference to Fig. 3.2, this means that the recognizer
is free to choose any sequence of aspect angles on the map, so long as each aspect angle
choice falls within the proper window.
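
A minimal Python sketch of the independent look computation follows. The function name and the `emit_loglik(k, cell)` interface are hypothetical. Each window is searched on its own, so consecutive pose estimates may jump arbitrarily within their windows.

```python
def independent_look(windows, emit_loglik):
    """Sum, over windows, of the best single-cell log likelihood in each window.

    windows     : list of lists of cells, one list per signature measurement
    emit_loglik : f(k, cell) -> log likelihood of signature k from this cell
    Returns (log of maximum joint likelihood, unconstrained pose sequence).
    """
    total = 0.0
    poses = []
    for k, window in enumerate(windows):
        best = max(window, key=lambda cell: emit_loglik(k, cell))
        poses.append(best)
        total += emit_loglik(k, best)
    return total, poses
```

Because no continuity is enforced, the returned pose sequence may be kinematically impossible even when its joint likelihood is high.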

5.5.5.2 The Perfect Knowledge of Aspect (PKA) Algorithm. The PKA
Algorithm provides an upper bound on recognition performance in that it assumes that the
recognizer knows perfectly the true aspect angle over time for each target class executing
the observed maneuver. For the Mahalanobis metric, this figure is simply the natural
logarithm of the joint maximum (classical) likelihood for known or true aspect angle $x^a_{k,n}$
over time, $\ln\{\prod_{k} [p(z^f_k \mid x^a_{k,n}, \omega_i)]\}$. These values are summed incrementally at each
event to find the natural logarithm of the joint likelihood. Note that if the kinematic
aspect angle estimate is poor, or aspect angle error bounds are too low, the sequence of
true aspect angles may not even fall on the map in Fig. 3.2, even for the true target.

5.5.5.3 The Fixed Bound (FB) Algorithm. The FB Algorithm is an
implementation of the Le Chevalier algorithm, as that approach is believed to work: the
algorithm has no information from kinematics on the expected direction of aspect angle
change, but knows that the change is bounded. No subsequent processing is applied. This
algorithm was illustrated in Fig. 3.11. The algorithm finds the natural log of the term in
Eqn. (5.14) (a modification of Eqn. (3.11)) as discussed in Sect. 3.6. Note that the term
$p(Z^f_k \mid \hat{X}^a_k, \omega_i)$ means specifically that this term is the joint classical likelihood of the
observed $Z^f_k$, conditioned on their having been associated with the aspect angle sequence
$\hat{X}^a_k$ found by the fixed bound process on model $\omega_i$.

$p(Z^f_k \mid \hat{X}^a_k, \omega_i) = \max_{\forall x^a_{k,n}} \left[ I^*(x^a_{k,n}, k \mid \omega_i) \right] = \max_{\forall x^a_{k,n}} \left[ p(z^f_k \mid x^a_{k,n}, \omega_i) \, I^*(x^a_{k-1,m}, k-1 \mid \omega_i) \right]$  (5.14)

5.5.5.4 The Larson and Peschon (L&P) Algorithm. The L&P Algorithm
finds the natural log of the term in Eqn. (3.11). Note that the numerical values of
this algorithm include contributions due to the a priori aspect angle transition likelihood
$p(x^a_{k+1,n} \mid x^a_{k,n}, Z^k_m, \omega_i)$. Contributions associated with the L&P "a priori" state $x^a_{0,n}$ were
not included, to provide for unbiased comparison with the results of other algorithms (none
of which include such a term, although they could in theory).

The  process  involved  in  calculating  the  likelihood  of  a  given  path  was  illustrated  in 
Figs.  3.3  and  3.4  of  the  last  chapter.  Computations  for  this  algorithm  proceed  exactly  as 
for  the  FB  algorithm,  except  that  log  transition  likelihoods  are  summed  along  with  log 
likelihoods  of  observed  signature  generation. 

5.5.5.5 Classical Sequence Comparison - One Dimensional (1-D Warp).
The 1-D Warp Algorithm defines contiguous one-dimensional paths in aspect angle,
parallel to and including the "nominal" path given by the extended Kalman filter/smoother
kinematic estimate. An algorithm functionally identical to one-dimensional, unrestricted
endpoint [176, 182] dynamic time warping is performed along each trajectory, as discussed
in Sects. 2.4.2 and 3.7.2. The basic form requires contiguous matching - every aspect angle
cell along a given path must be matched to a measurement. In this research, each local
path cost is normalized by the total number of associations along that path, and the path
with minimum normalized cost at the final (or latest) aspect angle window is selected as
the best path. The output likelihood value can be taken from this normalized cost, or it can
be taken from the "best" individual match for each signature along the selected contiguous
path (the latter choice is used in the results shown below, to achieve the minimum cost or
maximum likelihood on the true model, for better comparison with the other algorithms).

Best  path  selection  based  on  normalized  cost  is  a  departure  from  usual  DTW  practice, 
and  can  lead  to  violations  of  the  “Principle  of  Optimality”  [23],  but  worked  well  in  our  tests, 
since  for  the  proper  match  of  measurements  to  target  class,  and  constant  measurement 
noise  statistics  across  the  target  length,  the  local  average  matching  cost  is  expected  to  be 
near  the  global  average.  Without  such  normalization,  however,  in  the  scenarios  used  here, 
conventional  (“contiguous”)  sequence  warping  often  fails  to  follow  an  optimal  path  on  the 
true (unknown a priori) target model (i.e., a path close to the origin locations of the true
signatures).  This  failure  in  turn  causes  a  poor  reference  point  for  comparison  to  matches 
on  the  incorrect  models. 

The desire for local path normalization is driven by a particular combination of
factors  in  the  scenarios  investigated  here.  In  a  typical  case  for  this  research,  we  may  have 
six  observations  taken  at  0.4  second  intervals,  giving  an  elapsed  time  of  2.4  seconds.  Now 
over  this  same  period,  an  aircraft  target  executing  a  constant  4-g  turn  in  some  horizontal 
plane  at  a  speed  of  800  meters  per  second  will  exhibit  a  22.0  degree  aspect  angle  change 
to  a  stationary  observer  in  the  same  plane.  Assuming  a  1.0  degree  granularity  signature 
map,  and  some  overlap  (say  5  degrees)  at  each  end  of  the  aspect  angle  “map”  to  allow  for 
bias  uncertainties  in  the  target  aspect  angle,  we  may  have  22  +  2(5)  =  32  horizontal  cells 
in  our  aspect  angle  map.  Thus,  the  ratio  of  elements  in  the  two  sequences  to  be  compared 
is  6/32. 

If the correct association proceeds in an essentially linear fashion (6 observations
matching to 22 map cells), we expect that each observation will associate with
approximately 4 map elements. This 6/22 ratio is much less than the 1/3 ratio that previous
researchers in speech processing found to be the minimum acceptable [168, 181]. The
practical effect of this small ratio is that, without some form of compensation, the six
observations will match to considerably fewer of the map elements than 22. For the true
target, this will result in a match that is non-optimal in a practical sense - providing
signature likelihoods that are a poor basis for comparison with results from incorrect target
model associations (of course, the true target is unknown a priori). The probability of
correct recognition will be very low.

The  problem  is  that  the  classical  dynamic  programming  equations  given  in  Sect.  2.4.2 
lead  toward  a  bias  for  shorter  paths  in  the  lattice  of  Fig.  2.8.  Previous  researchers  have 
addressed  this  issue  by  a  wide  variety  of  continuity  constraints  that  reduce  the  penalty 
for  longer  or  “non-diagonal”  paths,  such  as  the  method  discussed  in  Sect.  5.4.1.  Several 
researchers  have  pointed  out  that  the  objective  is  a  path-length  normalized  warping  cost, 
but  have  maintained  that  this  normalization  cannot  be  made  locally  at  each  step  in  the 
dynamic  programming  process,  since  the  minimization  process  is  made  locally  based  on 
cost  for  each  path  only  [168]. 

In  this  research,  however,  the  author  has  done  just  exactly  that  -  normalized  locally, 
based  on  the  total  cost  and  number  of  associations  in  each  candidate  predecessor  path. 
The point here is that dynamic programming does not restrict the user to making local
decisions based only on cost to a given point [71:12]. Information other than path cost
can  be  retained  and  used  for  later  decisions.  That  being  said,  the  concept  of  performing 
an  addition  and  a  division  for  each  cost  calculation  is  a  significant  deviation  from  usual 
dynamic  programming  practice,  and  not  one  to  be  used  lightly.  However,  represented  in 
usual  dynamic  programming  form  as  a  sum  of  costs  at  each  stage,  path  normalization  can 
be  considered  simply  as  a  case  in  which  incremental  costs  at  each  stage  are  a  (complicated) 
function  of  the  path  to  that  point,  and  philosophically,  dynamic  programming  practice 
provides  for  this  case. 

Referring  back  to  Eqn.  (2.29),  the  process  of  path  cost  computation  with  local  length 
normalization  can  be  written  (and  executed  in  the  otherwise  usual  fashion)  as  shown  below 
in  Eqn.  (5.15):  this  is  essentially  Eqn.  (15)  of  [168]. 

$$D_n(C_k) = \frac{d(c_k) + (k-1)\,\min_{c_{k-1}}\left[D_n(C_{k-1})\right]}{k} \qquad (5.15)$$

where:

$c_k = [a_i, b_j]$ is the k-th element in a sequence of allowable associations of elements
from sequence A with elements of sequence B, this particular association being between
element $a_i$ and element $b_j$

$C_k = \{c_1, c_2, c_3, \ldots, c_k\}$, the minimum normalized cost sequence of associations
leading to and including association $c_k$

$d(c_k)$ = the cost or distance of association $c_k$, i.e., the distance in some metric between
element $a_i$ and element $b_j$

$D_n(C_k)$ = the normalized cost of reaching and accomplishing association $c_k$ by the
minimum normalized cost sequence of allowable associations

$D(C_k) = k\,D_n(C_k)$ = the total cost of reaching and accomplishing association $c_k$ by
the minimum normalized cost sequence of allowable associations
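
A minimal sketch of a recursion with local length normalization follows. It is not the author's implementation: it assumes scalar sequence elements, an absolute-difference distance, and the generic three-predecessor step pattern rather than the continuity constraints of Fig. 3.8. The function name `normalized_dtw` is hypothetical. Each lattice point retains the total cost and the number of associations of its selected path, and the predecessor is chosen by the normalized cost that results after adding the current association.

```python
def normalized_dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Warp sequence a onto sequence b, choosing predecessors by normalized cost.

    Returns (normalized cost, number of associations) for the path selected
    at the final lattice point. Illustrative sketch only.
    """
    m, n = len(a), len(b)
    # table[i][j] = (total cost, association count) of the path selected to reach (i, j)
    table = [[None] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            d = dist(a[i], b[j])
            preds = []
            if i > 0:
                preds.append(table[i - 1][j])
            if j > 0:
                preds.append(table[i][j - 1])
            if i > 0 and j > 0:
                preds.append(table[i - 1][j - 1])
            if not preds:
                table[i][j] = (d, 1)          # start of the path
                continue
            # Local normalization: compare (total + d) / (count + 1) across predecessors
            total, count = min(preds, key=lambda p: (p[0] + d) / (p[1] + 1))
            table[i][j] = (total + d, count + 1)
    total, count = table[m - 1][n - 1]
    return total / count, count


# Three observations matched against a six-cell library sequence
result = normalized_dtw([0, 1, 2], [0, 0, 1, 1, 2, 2])
print(result)
```

Because the selection at each point depends on the association count as well as the accumulated cost, the path retained at a point is not guaranteed to lie on the globally best normalized path.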

The  key  concept  to  keep  in  mind  is  the  “Principle  of  Optimality”,  which,  reworded 
somewhat  from  the  discussion  in  Sect.  2.4,  says  that  dynamic  programming  will  yield 

Figure 5.25. Violation of the Principle of Optimality From Local Path Normalization
(Figure labels, space of warping path associations. True Path, A to C: Cost 100, Elements 50,
Mean Cost 2.0. False Path, B to C: Cost 150, Elements 100, Mean Cost 1.5. True Path, C to D:
Cost 20, Elements 100. True Path, A to D: Mean Cost 0.8. False Path, B to D: Mean Cost 0.85.)

the  optimal  path,  when  the  locally  optimal  path  (during  the  matching  process)  lies  on  the 
globally  optimal  path.  Now,  for  a  local  normalization  process,  it  is  very  simple  to  construct 
a  set  of  conditions  in  which  the  locally  optimal  (minimum  average)  path  does  not  lie  on  the 
globally  optimal  path.  Fig.  5.25  shows  such  a  case  -  note  in  the  figure  that  at  point  C,  a 
dynamic  programming  algorithm  using  normalized  path  cost  would  select  the  “false”  path 
from  B  to  C  as  the  predecessor  route,  while  in  fact  the  “true”  path  in  a  globally  optimal 
sense  is  from  A  to  C,  and  on  to  point  D.  Thus,  local  normalization  can  lead  to  violations 
of  the  Principle  of  Optimality. 
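
The numbers in Fig. 5.25 can be verified directly. The sketch below uses only the costs and element counts given in the figure; the variable and function names are illustrative.

```python
# (total cost, number of associations) for the path segments labeled in Fig. 5.25
true_a_to_c = (100.0, 50)
false_b_to_c = (150.0, 100)
shared_c_to_d = (20.0, 100)


def mean_cost(*segments):
    """Mean cost per association over one or more concatenated path segments."""
    total = sum(cost for cost, _ in segments)
    count = sum(n for _, n in segments)
    return total / count


# Decision made locally at point C using normalized (mean) cost
local_true = mean_cost(true_a_to_c)       # 100 / 50
local_false = mean_cost(false_b_to_c)     # 150 / 100
chosen_at_c = "false" if local_false < local_true else "true"

# Mean cost of the complete paths ending at point D
global_true = mean_cost(true_a_to_c, shared_c_to_d)     # 120 / 150
global_false = mean_cost(false_b_to_c, shared_c_to_d)   # 170 / 200
best_at_d = "false" if global_false < global_true else "true"

print(chosen_at_c, best_at_d)
```

The local decision at C keeps the false path (mean cost 1.5 against 2.0), although the true path has the lower mean cost at D (0.8 against 0.85). An unnormalized comparison at C would have kept the true path, since its total cost of 100 is below 150.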

However,  in  the  particular  application  discussed  here,  we  expect  that  in  a  particularly 
important  case,  the  local  average  path  cost  will  equal  the  global  average  path  cost.  This 
occurs  when  three  conditions  are  satisfied:  (1)  the  observations  and  library  map  correspond 
to  the  same  target  model,  (2)  the  aspect  angle  transition  rate  (from  kinematics,  generally) 
used  to  define  the  map  is  close  to  the  true  one  which  generated  the  observations,  and  (3)  the 
signature  noise  process  statistics  are  constant  over  the  aspect  angles  and  times  of  interest. 
Now condition (1) will apply when we are making a correct target-model association, for
a well-modelled target. Condition (2) applies when we have a good kinematic estimate of
aspect angle rate. Finally, as noted in Sect. 5.3.4, Condition (3) appears to apply to HRR
radar signatures in particular and perhaps others in general.

In other words, when making a correct target-to-model association with good tracking
information, local path normalization may not change our answers significantly. The
benefit of normalization is that it allows us to demonstrate a “contiguous” path matching
concept in this scenario - the disadvantage of normalization is the computational cost of
the division operations. As noted above, normalization is essential to allow the algorithm
to make a correct match on the true (unknown a priori) target model. What happens for
incorrect target associations is irrelevant, as long as the matching path cost is higher than
for the true target model.

In  any  case,  we  must  observe  that  in  speech  processing,  obedience  to  classical  rules 
for  dynamic  programming  sequence  comparison  frequently  leads  to  matches  that  are  not 
“optimal”  in  a  practical  sense  [168].  This  is  where  the  “art”  in  the  “art  and  theory 
of  dynamic  programming”  [71]  has  to  be  applied  [181].  The  author’s  use  of  local  path 
normalization  is  such  an  example,  and  it  worked  as  intended,  as  we  shall  see  in  Sect.  5.6. 

Closing  this  discussion  on  path  normalization,  it  is  important  to  emphasize  that  the 
desire  for  normalization  was  only  driven  in  this  case  by  the  high  ratio  of  library  aspect  angle 
cells  to  measurements.  This  ratio  is  a  function  of  the  desired  discretization  of  aspect  angle, 
the  signature  sampling  interval,  and  the  target  turn  rate.  Under  conditions  where  the  ratio 
of  library  aspect  cells  to  measurements  is  lower,  normalization  will  not  be  required,  as  it 
is  not  for  conventional  applications. 

Finally, recall that in Sect. 3.8.2, it was noted that conventional sequence comparison
may have a potential for reduced computational requirements in comparison with an
algorithm with Larson and Peschon-type continuity constraints, since conventional sequence
comparison algorithms tend to limit the number of predecessor points severely. In the
implementation discussed here, this potential is reduced by the need to normalize, or divide
by the total number of associations along each path.

5.5.5.6 Classical Sequence Comparison - Two Dimensional (2-D Warp).
The 2-D Warp Algorithm uses the same set of trajectories defined for the one-dimensional
case, but local continuity constraints allow the optimum path to move from one trajectory
to its right or left neighbor (see Fig. 3.8). Other factors are as for the one-dimensional
case.

Note that for any given single aspect cell in any window, we must now consider up
to six possible predecessor paths - three from adjacent cells in the same window and three
from cells in the previous window. The dotted line labeled “2-D” in Fig. 3.8 shows how a
particular  association  trajectory  might  proceed.  The  path  length  normalization  process  is 
just  as  in  the  previous  section.  Thus,  we  have  increased  the  dimensionality  of  the  problem, 
but  allowed  the  algorithm  to  respond  to  “off-nominal”  path  conditions  where  the  true  path 
is  not  parallel  to  the  nominal  or  kinematic  path. 

Like other “tuning” issues associated with all of these dynamic programming
sequence comparison algorithms, this increased “flexibility” has pros and cons, as implied in
Sect. 3.7.2. From the pro perspective, loosening kinematic restrictions on signature
matching will help an algorithm to find the best match on the true (unknown a priori) target
model, if the kinematic state estimate, or basis for restrictions, was significantly incorrect.
From the con perspective, loosening kinematic restrictions will tend to allow an algorithm
to find an improperly close match on an incorrect target model. Limiting the latter effect,
of course, was precisely our original objective.

In any case, where kinematic information provides an accurate estimate of the
direction of aspect angle change, we expect that the one-dimensional algorithm will outperform
the two-dimensional one - the two algorithms will provide essentially the same answer
for the true (unknown a priori) target model, but the flexibility of the two-dimensional
algorithm will allow it to find an undesirably close match on some incorrect target model.
Conversely, if the true and kinematically-estimated aspect angle paths cross (are not
parallel), the two-dimensional algorithm will have a better chance of “finding home” on the
true target model - the lower cost it achieves here may offset any penalties incurred by
finding an improperly close match on an incorrect target model.

5.5.6 Conclusion. The material in this section has covered the operation of the
dynamic programming-based sequence comparison techniques and conventional or ideal
algorithms against which they will be evaluated. It should be clear that the various
algorithms have competing advantages and disadvantages with respect to optimality,
computational burden, etc. In the next section, we will evaluate the relative performance of these
algorithms, using both conventional performance measures and the generalized ambiguity
function, as discussed in Sect. 3.11.

5.6  Research  Results  and  Discussion 

5.6.1  Overview.  The  following  subsections  discuss  key  observations  regarding 
the  results  of  simulations  constructed  as  outlined  in  the  previous  part  of  this  chapter. 
Additional  equations,  results,  and  observations  regarding  the  operation  of  the  kinematic 
smoother  per  se  are  given  in  App.  C. 

5.6.2  Kinematic  State  Estimation.  For  this  research,  the  nine  state  extended 
Kalman  filter  discussed  in  Sect.  5.5.2  was  implemented  using  the  “Multimode  Simulation 
for  Optimal  Filter  Evaluation”  (MSOFE)  software  package  [46].  The  smoother  and  target 
recognition  algorithms  were  implemented  separately  using  a  file  structure  developed  from 
that  which  links  MSOFE  with  the  associated  plot  postprocessing  routine,  “MPLOT”. 

Fixed  lag  smoother  “lags”  of  two  to  three  seconds  were  found  to  be  entirely  adequate  - 
estimates  of  aspect  angle  and  rate  were  established  with  adequate  quality  to  permit  the 
sequence  comparison  algorithms  to  work  as  expected.  Fig.  5.26  shows  extended  Kalman 
filter  and  smoother  performance  in  estimating  one  inertial  component  of  target  acceleration 
over  20  runs,  where  the  true  target  acceleration  is  two  g’s  (64  ft/sec2)  during  the  period  from 
three  to  eleven  seconds  and  zero  elsewhere.  The  upper  solid  curve  is  mean  extended  Kalman 
filter  error  (similarly  bounded  on  either  side  by  curves  for  the  mean  extended  Kalman 
filter  error  +/-  one  standard  deviation),  while  the  lower  solid  curve  is  mean  smoother  error 
(bounded  on  either  side  by  curves  for  the  mean  smoother  error  +/-  one  standard  deviation). 
The smoother delay here is 2.0 seconds - the smoothed estimate (curve number two) is
available from 2.0 to 11.9 seconds, using filter information out to 13.9 seconds (the filter
runs on until 15.0 seconds). Note that the filter starts with true and estimated acceleration
equal to zero - hence the zero error at the start of the run.

Figure 5.26. Acceleration Estimation Error: Mean +/- One Standard Deviation

As  noted  earlier,  although  the  optimal  smoother  did  an  excellent  job  of  correcting 
extended  Kalman  filter  state  estimates  as  shown  here,  acceleration  estimates  were  still 
too  noisy  from  measurement  event  to  measurement  event  to  provide  smooth  aspect  angle 
estimates,  particularly  in  state  directions  where  insufficient  true  acceleration  made  use  of 
the  optimal  smoother  pointless  [154:11].  Also,  the  smoother-derived  acceleration  estimates 
were  rather  more  noisy  than  desired  for  identifying  steady-state  acceleration  conditions, 
which  indicate  target  turn  events  for  recognition.  Nominally,  a  steady-state  turn  event  was 
identified  when  estimated  target  acceleration  in  the  target  body  frame  remained  within 
+/- 5 ft/sec2 of a running average over a one-second period. To meet these stringent
smoothness conditions, second degree polynomials were fitted to the filter/smoother
position estimates, and differentiated twice to obtain an acceleration estimate with error
magnitudes  that  closely  follow  the  mean  smoother  value  in  Fig.  5.26.  Additional  plots 
showing  filter,  smoother,  and  curve  fit  results  are  provided  in  App.  C. 
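
The curve-fitting step can be illustrated with a small, self-contained sketch. It is not the code used in this research: it assumes an ordinary least-squares fit of a second degree polynomial to one position component over a short window, solved through the normal equations, and takes twice the quadratic coefficient as the acceleration estimate. The function names and the sample window are hypothetical.

```python
def fit_quadratic(t, y):
    """Least-squares fit of y ~ c0 + c1*t + c2*t**2; returns (c0, c1, c2)."""
    # Normal equations built from sums of powers of t
    s = [sum(ti ** p for ti in t) for p in range(5)]
    r = [sum(yi * ti ** p for ti, yi in zip(t, y)) for p in range(3)]
    m = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x3 augmented system
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [x - f * p for x, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def acceleration_from_positions(t, pos):
    """Second derivative of the fitted quadratic, i.e. 2 * c2."""
    _, _, c2 = fit_quadratic(t, pos)
    return 2.0 * c2


# Hypothetical two-second window sampled every 0.4 sec, constant 64.4 ft/sec^2
t = [0.4 * k for k in range(6)]
pos = [10.0 + 800.0 * ti + 0.5 * 64.4 * ti ** 2 for ti in t]
accel = acceleration_from_positions(t, pos)
print(f"{accel:.3f} ft/sec^2")
```

A quadratic fit yields a single constant acceleration over the window, which is why it suits the identification of steady-state turn conditions.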

The disadvantages of smoothing are added processing and the fact that our target
information  is  no  longer  real  time.  In  general,  we  found  that  a  high  quality  (+/-  20%) 
estimate  of  the  target  acceleration  was  obtained  with  a  four-second  delay  -  two  seconds  for 
the  fixed-lag  smoother  and  two  seconds  for  polynomial  curve  fitting.  Following  onset  of  a 
major  maneuver,  2-3  more  seconds  of  delay  are  desirable  to  identify  steady  state  conditions 
(note  how  the  fixed  lag  smoother  mean  error  curve  in  Fig.  5.26  begins  to  level  out  near  the 
five  second  point).  In  any  case,  as  shown  in  Fig.  5.26,  for  a  2-g  turn  lasting  as  little  as  eight 
seconds,  the  target  acceleration  can  be  estimated  with  high  confidence  for  approximately 
five  seconds  (i.e.,  from  approximately  the  5  second  point  to  the  10  second  point).  As  we 
will  show  below,  the  advantage  accrued  in  better  state  estimates  can  be  well  worth  the 
wait  and  processing,  particularly  for  turning  accelerations  in  excess  of  1  g. 

Once the target velocity and acceleration states are estimated and assumed to be in
steady state relative to the target body frame, calculation of target-sensor aspect angle
and aspect angle rate is straightforward for any set of assumptions on target control
parameters, as shown in Sect. 5.5.3. Even for the relatively large kinematic measurement
errors modelled in this research, after smoothing and curve fitting, kinematically-based
aspect angle estimates were never observed to lie more than ten degrees from the true
(coordinated turn) figure. Aspect angle rates were never more than 25% in error. Even
with high quality kinematic measurements, aspect angles defined from filtering without
smoothing were often subject to much larger errors, particularly at the onset of turn events.

For example, note in Fig. 5.26 that, at the 4-second point, the mean error in the
filter-estimated acceleration is approximately 50 ft/sec2, while the mean error in the
smoother-estimated acceleration is only about 15 ft/sec2. Therefore, depending on the
sensor position, an aspect angle estimate based on the corresponding mean filter
acceleration estimate could be in error by as much as 40 degrees - the difference between roll
angles required to achieve (1) 64.4 (true) - 50 (error) ≈ 15 ft/sec2 and (2) 64.4 ft/sec2 (two g's,
true) horizontal acceleration planar turns. In comparison, a smoother-derived aspect angle
estimate based on the corresponding mean smoother acceleration estimate of 64.4 (true) -
15 (error) ≈ 50 ft/sec2 would be in error by only about 6 degrees. (From the equation in
Fig. 5.24, for a lateral acceleration estimate of 50 ft/sec2 in a level, planar turn, we obtain a
roll angle of arctan(50/32.2) = 57.22 deg, versus the figure of arctan(64.4/32.2) = 63.43 deg
for the actual two-g lateral acceleration.)
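
The roll angle comparison can be reproduced with a few lines. The sketch assumes the level, planar turn relation quoted above (roll angle equal to the arctangent of lateral acceleration over g) with g = 32.2 ft/sec2, and uses the rounded acceleration estimates of 15 and 50 ft/sec2 from the text. The names are illustrative.

```python
import math

G_FT_S2 = 32.2  # gravitational acceleration, ft/sec^2


def roll_angle_deg(lateral_accel):
    """Roll angle for a level, planar turn with the given lateral acceleration."""
    return math.degrees(math.atan(lateral_accel / G_FT_S2))


true_roll = roll_angle_deg(64.4)       # actual two-g lateral acceleration
filter_roll = roll_angle_deg(15.0)     # mean filter estimate at the 4-second point
smoother_roll = roll_angle_deg(50.0)   # mean smoother estimate at the 4-second point

filter_error = true_roll - filter_roll
smoother_error = true_roll - smoother_roll

print(f"true {true_roll:.2f}, filter error {filter_error:.1f}, "
      f"smoother error {smoother_error:.1f} (deg)")
```

The result is mildly sensitive to the value taken for g: 32.2 ft/sec2 gives a smoother-based roll angle of about 57.2 degrees, while 32 ft/sec2 gives about 57.4 degrees. Either way the difference from the two-g roll angle is about 6 degrees, against nearly 40 degrees for the filter-based estimate.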

5.6.3 Recognition Algorithm Performance. Typical generalized ambiguity function
(GAF) outputs obtained in this research are shown in Figs. 5.27 and 5.28. The dotted
vertical lines on each figure indicate parameter (target) interpolation values for which
likelihood functions were defined, and spline curve fits connect the mean function values to
provide the curves shown. Relevant target trajectory parameters are shown in each figure.
As in Chapter IV, the nominal sensor-to-target range is 100,000 feet.

Note  in  each  set  of  generalized  ambiguity  functions  that  the  Independent  Look  (IL) 
algorithm  defines  the  upper  bound  on  performance  (worst),  and  the  Perfect  Knowledge  of 
Aspect  (PKA)  algorithm  defines  the  lower  bound  (best,  but  unattainable  in  practice).  The 
Fixed  Bound  (FB)  algorithms  provide  significantly  improved  separation  from  the  IL  result, 
but  the  algorithms  which  fuse  filter/smoother-provided  observed  kinematic  information 
provide  separation  equal  to  or  better  than  that  of  the  FB  algorithm  in  each  of  these 
cases.  Note  that  the  fixed  bound  extent  is  increased  in  Fig.  5.28  to  allow  the  algorithm  to 
follow  the  higher  turn  rate  of  this  scenario.  In  general,  with  good  kinematic  estimates  of 
aspect  angle  state,  the  matching  algorithm  performance  improves  as  the  level  of  kinematic 
restriction  increases.  Thus,  the  order  of  improving  performance  is  expected  to  be  IL,  FB, 
2-D  Warp,  1-D  Warp,  L&P,  and  PKA. 

Observe  the  highly  nonlinear  shape  of  the  GAFs  in  Fig.  5.28  (and  of  the  PKA  GAF 
in  Fig.  5.27).  Our  results  have  shown  that  interactions  between  the  HRR  radar  sweep- 
to-sweep  comparison  metric  and  the  changes  in  the  parameter  space  due  to  the  morphing 
process  can  create  apparently  anomalous  results  -  e.g.,  cases  where  the  measurements  from 
an  F-4  were  closer  in  Mahalanobis  metric  sense  to  sweeps  from  the  MIG  than  they  were 
to  sweeps  yielded  by  an  interpolated  target  only  25%  removed  from  the  F-4.  These  cases 
resulted  from  the  relative  motion  of  scatterers  during  the  morphing  process,  and  were 
found  to  be  physically  reasonable  after  investigation.  Modified  morphing  rules  can  resolve 
these anomalies if desired - transitions in parameter space “between” two target parameter
sets or locations can be made along any number of paths to yield any number of different
generalized ambiguity function shapes.

Figure 5.28. Generalized Ambiguity Function for Case 2. Mahalanobis metric, 6 dBsm
std. dev. noise. (X axis: morphing fraction, from F-4 (true) to MIG-21.)

Although  not  shown  here,  the  moments  of  the  distributions  about  the  mean  values 
giving  the  GAF  are  small  (e.g.,  standard  deviations  of  3-4  units).  Thus,  the  parent  target 
classes are readily separable in the feature space (simulated HRR radar) and metric
(Mahalanobis) used here with any of the algorithms shown - the intent here is to show better
separation  with  fusion  of  motion  information.  This  effect  is  particularly  noticeable  for 
pseudo-targets  close  to  the  parent  (F-4)  which  provides  the  measurements.  In  this  region, 
the  FB  algorithm  provides  little  improvement  over  an  IL  algorithm. 

In  fact,  as  discussed  in  Sect.  5.3.5,  considerable  effort  was  required  in  this  research 
to  define  target  models,  trajectories,  and  noise  levels  such  that  any  algorithm  would  fail 
to  indicate  the  correct  target.  For  any  pair  of  targets,  trajectory,  and  kinematic/signature 
noise  realization,  failure  of  an  algorithm  is  defined  as  the  algorithm  indicating  a  higher 
likelihood  (of  generating  the  observed  signatures)  for  an  incorrect  target  model  than  for 
the  correct  (unknown  a  priori)  target  model.  The  scenario  that  ultimately  provided  the 
desired  stressing  performance  was  a  combination  of  a  trajectory  like  that  in  Figs.  4.1 
and  5.27  with  scatterer-augmented  targets  and  high  noise  (9  dBsm  std.  dev.),  producing 
signatures  like  those  in  Figs.  5.17  through  5.19.  Fig.  5.29  expresses  the  results  of  this 
scenario  in  classical  terms  of  percent  correct  recognition,  in  a  format  similar  to  that  used 
for  the  generalized  ambiguity  function  curves.  Note  that  the  PKA  algorithm  provided 
100%  correct  recognition  in  every  case,  and  so  is  not  shown  explicitly. 
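
The failure criterion above translates into a simple tally over Monte Carlo trials. The sketch below is illustrative only: the function name and the log-likelihood values are hypothetical and are not results from this research. A trial counts as correct unless some incorrect model receives a strictly higher likelihood than the true model.

```python
def percent_correct(trials):
    """Percent of trials in which no incorrect model out-scores the true model.

    trials: sequence of (true_model_likelihood, [incorrect_model_likelihoods]).
    """
    trials = list(trials)
    correct = sum(
        1 for true_l, others in trials
        if all(true_l >= other for other in others)
    )
    return 100.0 * correct / len(trials)


# Hypothetical log-likelihoods for four noise realizations and one incorrect model
example_trials = [
    (-10.0, [-12.0]),
    (-11.0, [-10.5]),   # incorrect model scores higher: a failure
    (-9.0, [-9.5]),
    (-8.0, [-8.2]),
]
score = percent_correct(example_trials)
print(score)
```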

Fig. 5.29 shows the percentage of correct recognition for the six algorithms of interest,
defined for target parameter (morph) values of 0.04 to 0.25, where morph value 0.0 (the true
target) is the scatterer-augmented MIG-21, and morph value 1.0 is the scatterer-augmented
and down-scaled SU-22. Clearly, for pseudo-targets at small morph fractions, with one
exception, the sequence comparison techniques outperform the Independent Look (IL)
algorithm.  The  “optimum”  Larson  and  Peschon  technique  does  particularly  well,  because 
its  explicit  aspect  angle  transition  likelihoods  do  the  best  job,  relative  to  other  algorithms, 
of  restricting  it  from  finding  an  improperly  high  likelihood  match  on  an  incorrect  model. 

Figure 5.29. Percent Correct Recognition for Case 3. Mahalanobis metric, 9 dBsm std.
dev. noise. (X axis: morphing fraction, 0.0 to 0.25; 0.0 = MIG-21 w/ scats., 1.0 = SU-22 w/ scats.)

Fig. 5.30 shows the generalized ambiguity functions corresponding to Fig. 5.29. The
increased  ambiguity  is  evident  in  the  much  more  closely  grouped  generalized  ambiguity 
function  values  for  this  case. 

Since the generalized ambiguity function plots shown thus far have placed the true
target or parameter set on the left boundary, they are ill-suited to demonstrate the
curvature around that parameter point which we associated with the Cramer-Rao lower bound
(CRLB) in Sect. 2.7. Accordingly, the same set of targets and algorithms was used in a
Monte Carlo set for which the “12% Morph” is the true target, or origin of measurements.
The generalized ambiguity function plot resulting from this case is shown in Fig. 5.31.
Note that the increase in curvature around the true target location is clearly evident as
we proceed from the Independent Look algorithm to the Perfect Knowledge of Aspect
algorithm. This increased curvature implies progressively lower Cramer-Rao bounds on the
parameter estimates for the respective recognition algorithms, bounded by that for the
PKA algorithm, which is unattainable in practice.
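
For reference, the standard scalar form of the bound makes the link between curvature and estimation accuracy explicit. This is the textbook statement for an unbiased estimator, not a result specific to this research. The symbols $\theta$ (the target parameter), $\mathbf{z}$ (the measurements), and $p(\mathbf{z};\theta)$ (the likelihood) are notation assumed here, and the treatment in Sect. 2.7 may use different symbols.

$$\operatorname{var}(\hat{\theta}) \;\ge\; \left( -E\left[ \frac{\partial^2 \ln p(\mathbf{z};\theta)}{\partial \theta^2} \right] \right)^{-1}$$

A likelihood surface that is more sharply curved about the true parameter gives a larger expected negative second derivative and therefore a smaller bound.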

Figure  5.32  demonstrates  that  the  high  ambiguity  evidenced  in  the  previous  case  was 
in  large  part  a  function  of  trajectory  or  aspect  angle,  since  that  scenario  caused  the  target 
scatterers  to  be  highly  “concentrated”  in  range.  In  Fig.  5.32,  we  have  used  the  same, 
scatterer-augmented  target  set  with  a  trajectory  in  which  the  sensor-to-target  vector  is 
essentially  along  the  target  longitudinal  axis,  and  the  scatterers  are  separated  in  range. 
The  target  signatures  are  now  sufficiently  distinct  from  one  another  that  the  percentage  of 
correct  recognition  is  again  100%  for  all  algorithms  (therefore,  no  “percent  correct”  plot 
is  given),  and  the  generalized  ambiguity  functions  show  much  higher  variation  from  left  to 
right. 

In  Figure  5.33,  we  return  to  a  stressing  trajectory  similar  to  that  of  Fig.  5.30,  but 
with  a  non-planar  turn  and  models  that  are  fundamentally  more  distinct  in  structure  - 
a  YAK-28  and  a  B-737  scaled  to  the  approximate  size  of  the  YAK  (this  B-737  model 
is  comparable  in  many  respects  to  a  small  business  class  jet).  For  the  morph  fractions 
investigated,  we  again  see  the  usual  trends  for  reduction  in  ambiguity.  The  targets  are 
again  sufficiently  distinct  that  the  percentage  of  correct  recognition  is  100%  in  each  case, 
and  no  plot  of  those  results  is  required. 

Figure 5.30. Generalized Ambiguity Function for Case 3. Mahalanobis metric, 9 dBsm
std. dev. noise. (X axis: morphing fraction; 0.0 = MIG-21 w/ scats. (true), 1.0 = SU-22 w/ scats.)

Figure 5.31. Generalized Ambiguity Function for Case 3 (Modified). Mahalanobis metric,
9 dBsm std. dev. noise.

Figure 5.32. Generalized Ambiguity Function for Case 4. Mahalanobis metric, 9 dBsm
std. dev. noise. (X axis endpoints: MIG-21 w/ scat. (true) and SU-22 w/ scat.)

Figure 5.33. Generalized Ambiguity Function for Case 5. Mahalanobis metric, 9 dBsm
std. dev. noise. (X axis: morphing fraction, from YAK-28 (true) to B-737. Note: B-737 here
scaled to YAK-28 length.)

In Figures 5.34 and 5.35, we return to a case in which the origin targets are sufficiently
ambiguous  to  cause  frequent  incorrect  identifications  with  the  Independent  Look  algorithm. 
In  this  scenario,  the  origin  or  “endpoint”  targets  are  an  F-4  with  added  scatterers,  with  and 
without  4  X  2000  lb.  bombs  (two  bombs  under  each  wing),  respectively.  The  trajectory 
is  that  of  Fig.  5.33.  As  Fig.  5.35  graphically  illustrates,  this  case  is  extremely  ambiguous. 
Unique  among  cases  examined  in  this  research,  these  targets  were  so  ambiguous  that 
even  perfect  knowledge  of  origin  aspect  angle  was  insufficient  to  guarantee  100%  correct 
identification,  as  shown  in  Fig.  5.34. 

The classical sequence comparison or dynamic time warping-based techniques in
general have not performed as well as Larson and Peschon algorithms, and their performance
was particularly poor in this last, most stressing case. A detailed review of results shows
clearly that this fact is largely due to a convenient artificiality in this test which
real-world data may not reflect. Specifically, the measurement values in these tests are drawn
from discrete locations, generally separated by two or more aspect angle “cells”, or
approximately 1.5 degrees. In defining an optimum low-cost path, the classical sequence
comparison-based techniques used here are forced to define a contiguous path across the
aspect angle space - i.e., measured signatures must in general be matched to library values
perhaps considerably different from their origin locations, even on the correct model.

This factor tends to prevent the classical sequence comparison algorithms from
finding as low an association cost on the true (unknown a priori) model as do other algorithms.
With the likelihood value variations induced by noise, then, an incorrect model
occasionally provides a higher likelihood than the true model, causing the classical algorithm to
fail. This is somewhat more true in the case of the two-dimensional sequence
comparison algorithm, which is allowed to find rather higher likelihood (lower cost) matches on
incorrect models, while the match on the correct model tends to be the same as for the
one-dimensional algorithm. The nature of the Larson and Peschon approach, on the other
hand, allows it to find high likelihood (low cost) matches on the true model and, due to
its explicit kinematic restrictions, constrains it to lower likelihood (higher cost) matches
on the incorrect model.

Figure 5.34. Percent Correct Recognition for Case 6. Mahalanobis metric, 9 dBsm std.
dev. noise.

Figure 5.35. Generalized Ambiguity Function for Case 6. Mahalanobis metric, 9 dBsm
std. dev. noise. (X axis: morphing fraction; 0.0 = F-4 w/ scats. and 4 x 2000 lb. bombs (true),
1.0 = F-4 with 0 bombs.)

This effect can be mitigated by providing for continuity constraints which allow the
classical  sequence  comparison  algorithms  to  “skip”  aspect  angle  cells  with  poor  matching 
likelihoods,  or  by  basing  the  output  likelihood  on  the  “best”  match  for  each  signature 
along  the  contiguous  path  (the  latter  choice  is  used  in  the  previous  results).  Recall  that, 
in  a  generic  sense,  the  Larson  and  Peschon-based  algorithms  are  classical  sequence  com¬ 
parison  algorithms  that  skip  as  many  library  cells  as  required  to  match  n  observations  to 
n  library  cells.  An  essential  factor  to  remember,  however,  is  that  allowing  any  dynamic 
programming  algorithm  to  “skip”  states  brings  an  immediate  increase  in  dimensionality 
and  computational  cost.  Note  in  Fig.  3.8  that,  for  one- dimensional  sequence  comparison 
with  the  continuity  constraints  shown,  a  given  library  cell-measurement  association  has  at 
most  two  predecessors  -  the  two-dimensional  case  has  at  most  six  predecessors.  Larson 
and  Peschon-based  algorithms  are  not  so  limited. 
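
The predecessor count can be illustrated with a minimal sketch. It assumes a hypothetical cost table `costs[k][c]` (for example, a negative log-likelihood of associating measurement k with library cell c) and a one-dimensional library; the function name `min_path_cost` and the parameter `max_step` are illustrative only and do not reproduce the constraints of Fig. 3.8.

```python
import math


def min_path_cost(costs, max_step=1):
    """Return the minimum total cost of a monotone association path.

    costs[k][c] is a hypothetical cost of associating measurement k with
    library cell c. Between successive measurements the path may stay in
    cell c or advance by at most max_step cells, so each association has
    at most max_step + 1 predecessors.
    """
    n_cells = len(costs[0])
    best = list(costs[0])  # the path may start in any library cell
    for row in costs[1:]:
        new_best = [math.inf] * n_cells
        for c in range(n_cells):
            first_pred = max(0, c - max_step)
            # Cheapest allowed predecessor among cells first_pred..c.
            new_best[c] = min(best[first_pred:c + 1]) + row[c]
        best = new_best
    return min(best)


# Two measurements whose best library cells lie three cells apart.
costs = [[1, 9, 9, 9],
         [9, 9, 9, 1]]
tight = min_path_cost(costs, max_step=1)  # two predecessors per association
loose = min_path_cost(costs, max_step=3)  # skipping allowed, four predecessors
```

With `max_step=1` each association examines two predecessors and the path cannot reach both low-cost cells; raising `max_step` lowers the cost found but enlarges the search at every step, which is the dimensionality penalty noted above.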

In  any  case,  however,  real-world  signatures  do  not  arise  exclusively  from  one  aspect 
angle state location, but rather are the “blurred” result of observations over small contiguous
extents of aspect angle. It is expected that this blurring would tend to reduce the
performance differential between Larson and Peschon-type and classical sequence comparison
algorithms. Consider that dynamic time warping works in speech processing because
(1)  the  “feature  observables”  extracted  from  speech  are  reasonably  continuous,  (2)  they  can 
be  extracted  with  relatively  little  noise,  (3)  suitable  metrics  exist  to  distinguish  differences 
between  their  realizations,  and  (4)  the  feature  observable  functions  can  be  discretized  or 
partitioned  over  time  at  intervals  which  preserve  “closeness”  in  the  chosen  metric  sense 
between  adjacent  elements  in  the  partitioned  functions.  The  ease  with  which  classical 
sequence  comparison  can  be  employed  will  depend  upon  these  factors  as  well,  and  more. 

We must keep in mind that the nature and behavior of feature observables in multisensor
fusion vary widely with each sensor and its phenomenology. Relative performance
of  any  one  algorithm  in  different  feature  spaces  is  difficult  to  predict.  We  may,  for  example, 
encounter  a  signature  or  feature  space  situation  like  that  hypothesized  in  Sect.  3.6.6,  in 
which  Larson  and  Peschon  algorithms  skip  highly  unlikely  aspect  angle  state  locations  to 
find a low cost match on a target model that is generally unlikely to have produced the
observed measurements. In that scenario, a Larson and Peschon-type algorithm might
perform poorly.

In  at  least  one  test  set  using  the  slide  distance  metric,  it  appears  that  an  effect  of 
this  type  was  observed  -  basically,  for  the  scenario  of  Fig.  5.32,  the  test  was  repeated 
using  the  slide  distance  metric  and  noise  of  5  dBsm  standard  deviation.  The  Independent 
Look  and  Fixed  Bound  algorithms  exhibited  50%  correct  recognition  for  all  morphs,  or 
no  better  than  a  coin  toss,  while  the  1-D  and  2-D  Warp  algorithms  exhibited  60%  correct 
recognition  or  better  consistently  for  the  67%  and  75%  morphs.  Since  it  explicitly  includes 
transition  likelihoods,  the  Larson  and  Peschon  algorithm  was  not  used  in  this  test  per  se, 
but  a  version  of  the  Fixed  Bound  algorithm  “biased”  in  the  direction  of  motion  did  no 
better  than  the  standard  Fixed  Bound  algorithm.  It  appears  that  the  ability  to  skip  cells 
gave  even  Larson  and  Peschon-type  dynamic  programming  algorithms  the  ability  to  find 
very  low  cost  matches  for  this  metric  consistently  on  the  wrong  targets,  while  the  1-D  and 
2-D  Warp  algorithms  found  somewhat  higher  costs.  For  reasons  noted  later  in  this  chapter, 
however,  this  observation  should  not  be  taken  as  representative  of  the  performance  of  the 
slide  distance  metric  in  general. 

The  only  statement  that  can  be  made  with  certainty  is  that  restricting  signature 
associations  according  to  a  priori  information  (kinematic  or  otherwise),  if  done  properly 
(so  as  not  to  preclude  associations  required  by  likely  behavior  of  the  truth  model),  should 
improve recognizer performance, where incorrect recognitions are caused by unlikely associations
on the wrong target model. Again, the expected order of improving performance
corresponds  to  the  order  of  increasing  restriction  for  observed  kinematics:  IL,  FB,  2-D 
Warp,  1-D  Warp,  L&P,  and  PKA.  The  next  section  illustrates  how  these  algorithms  and 
their  kinematic  restrictions  lead  to  lower  likelihoods  for  incorrect  associations. 

5.6.4  Effects  of  Considering  Likely  Kinematics:  Restricting  Transitions.  In 
Sect. 2.6.2, we proposed that restricting target recognition algorithms to kinematically-feasible
subsets of a matching function domain would be expected to decrease ambiguity.
Fig. 5.36 shows why progressive domain restriction provides better separation when measurements
from one target class are matched to the library for another (i.e., wrong) class.

Figure 5.36. ML Aspect Angle Estimates on the Wrong Target

This figure represents a region of solid angle in target aspect defined by the union of six
“windows” or aspect angle bounds for any one measurement. The “brackets” along the
right side of this diagram show how the windows for successive measurement events may
in general overlap (window two, or “W2”, for example, extends along-track from aspect
angle  cell  row  3  to  row  15).  The  maximum  likelihood  aspect  angle  associations,  or  pose 
estimates  identified  by  several  algorithms  over  this  angular  extent  for  six  measurements 
are  shown  -  the  true  aspect  angle  is  shown  as  well. 

Note  the  erratic  aspect  sequence  selected  by  the  IL  processor.  Recall  that  the  IL 
processor is allowed to find the best signature-to-library match (i.e., pose estimate) within
the  appropriate  window  -  regardless  of  past  (or  future)  associations.  This  erratic  sequence 
therefore  gives  a  lower  matching  cost  (higher  signature  likelihood)  than  those  found  by 
algorithms  which  restrict  transitions  according  to  likely  or  observed  kinematics.  Next,  note 
the  still  rather  unlikely  sequence  selected  by  the  FB  algorithm.  This  algorithm  requires 
successive  pose  estimates  to  lie  within  the  bound  limit,  but  does  not  explicitly  penalize  the 
matching  cost  for  transitions  that  conflict  with  a  priori  expectations. 
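
The difference between the two processors can be sketched as follows, under simplifying assumptions that are not taken from the source: a one-dimensional row of aspect cells instead of the two-dimensional aspect space, a hypothetical cost table `costs[k][c]`, and inclusive per-measurement windows `(lo, hi)`. The names `independent_look` and `fixed_bound` are illustrative.

```python
def independent_look(costs, windows):
    """Cheapest cell inside each measurement's window, ignoring neighbours."""
    path = []
    for k, (lo, hi) in enumerate(windows):
        path.append(min(range(lo, hi + 1), key=lambda c: costs[k][c]))
    total = sum(costs[k][c] for k, c in enumerate(path))
    return total, path


def fixed_bound(costs, windows, bound):
    """Cheapest path whose successive cells differ by at most `bound`.

    Transitions inside the bound are not penalised; those outside it are
    forbidden. Assumes at least one feasible path exists.
    """
    lo0, hi0 = windows[0]
    best = {c: (costs[0][c], [c]) for c in range(lo0, hi0 + 1)}
    for k in range(1, len(costs)):
        lo, hi = windows[k]
        new_best = {}
        for c in range(lo, hi + 1):
            cands = [(v, p) for pc, (v, p) in best.items() if abs(pc - c) <= bound]
            if cands:
                v, p = min(cands, key=lambda t: t[0])
                new_best[c] = (v + costs[k][c], p + [c])
        best = new_best
    return min(best.values(), key=lambda t: t[0])


# Low-cost cells alternate between opposite ends of the window.
costs = [[0, 5, 5, 5, 5],
         [5, 5, 5, 5, 0],
         [0, 5, 5, 5, 5]]
windows = [(0, 4)] * 3
il_total, il_path = independent_look(costs, windows)
fb_total, fb_path = fixed_bound(costs, windows, bound=1)
```

In this toy case the Independent Look path jumps across the whole window and back, while the Fixed Bound path is forced to accept a higher cost; neither processor weights a transition by its a priori likelihood.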

The  DTW  and  L&P-based  algorithms  select  more  likely  (linear)  aspect  angle  paths, 
and  their  predilection  to  follow  kinematically-reasonable  paths  forces  a  higher  matching 
cost (lower likelihood) than given by IL or even FB algorithms for this incorrect model-to-target
association. In contrast, when measurements are matched to their true target
class of origin, the different algorithms are much more likely to associate with the same
(true)  aspect  angle  region,  although  the  IL  algorithm  may  still  give  unlikely  pose  estimate 
transitions,  as  we  will  shortly  see. 

Closing our discussion of Fig. 5.36, consider the effect of using a pose estimate extracted
from this figure to drive the dynamics model in a kinematic/aspect filter of the
kind described in Chapter IV. If some signature sequence-matching algorithm were able
to provide a (zero error) pose estimate corresponding to points T1 through T6, then the
kinematic/aspect filter associated with this target model would “fly” properly, exhibiting
reasonable residuals, and so on. On the other hand, what will happen if the filter is
provided with the “Larson and Peschon” pose estimate? Now it happens that, for the
trajectory shown here (that of Fig. 5.27 et al.), error in the along-track pose estimate
corresponds closely to error in angle of attack, and error in the cross-track pose estimate
corresponds  closely  to  error  in  roll  angle.  Thus,  we  would  expect  on  the  order  of  two  to 
3.4  degrees  error  in  the  angle  of  attack  estimate,  and  seven  degrees  error  in  the  roll  angle 
estimate.  This  error  would  contribute  to  “conflict”  between  states  in  the  kinematic/aspect 
filter,  which  would  be  revealed  by  higher  residual  error  for  the  incorrect  model  than  for 
the  correct  model  (although,  for  this  case,  perhaps  not  as  large  a  difference  as  those  shown 
in  Chapter  IV). 

The  reader  may  observe  that  use  of  the  pose  estimate  sequence  from  the  Independent 
Look  processor  in  Fig.  5.36  could  have  an  even  more  severe  effect  on  the  operation  of  a 
kinematic/aspect  filter,  even  though  this  sequence  is  perhaps  closer  in  the  mean  to  the  true 
aspect angle sequence. However, even on the correct (unknown a priori) target model,
the Independent Look pose estimate sequence can show wild changes. This behavior is
observed in Fig. 5.37, showing results from the first run made for Fig. 5.29 - a high noise,
high ambiguity case. Arrows indicating transitions for the dynamic programming algorithms
are omitted, since these associations are all basically in the same area. Note, however,
that  the  pose  estimate  sequence  for  the  Fixed  Bound  algorithm  still  does  not  conform  very 
well  to  the  true  sequence  -  the  Larson  and  Peschon  algorithm  follows  it  exactly,  and  the 
classical  sequence  comparison  algorithms  differ  only  in  minor  ways.  Clearly,  it  is  better  to 
extract  this  pose  estimate  after  “smoothing”  by  sequence  comparison  methods.  We  will 
return  to  this  subject  in  the  following  chapter. 

In  general,  the  improvement  from  kinematic  information  fusion  increases  with  the 
mean aspect angle rate or g level of the target’s turn. As turn rate increases, physics
limits the number of possible aspect angle states $x^a$ (and therefore state sequences),
and we can limit the remaining matching domain even more severely to (fewer) sequences
of expected length and direction. For the FB algorithm (with a fixed sampling rate),
however,  we  must  open  the  aspect  angle  bounds  to  give  it  any  chance  of  tracking  the 
nominal  aspect  rate  on  the  true  target.  This  increases  dimensionality  and  gives  it  a  greater 
chance  of  finding  an  improperly  high  likelihood  match  on  an  incorrect  target  model.  Other 
approaches  for  identifying  infeasible  aspect  angle  sequences  may  mitigate  this  problem, 
but  may  not  effectively  use  the  information  available  in  observed  kinematics. 

[Plot axis labels: cross-track (deg.); window bounds]

Figure 5.37. ML Aspect Angle Estimates on the Correct Target. Note: The Larson
and Peschon path selected the true location in each case. Selected points
for other paths, if not shown explicitly, also fall on the “multiple” point
locations.

Conversely, as turn rate decreases, the small mean aspect angle rate available to
motion fusion algorithms tends to produce the same results as the FB algorithm, which
assumes no mean rate, and can use small bounds when a small mean rate exists. For a
zero-mean turn rate estimate, FB algorithms provide an effective approach - this is simply the
limiting case of the L&P algorithm for a zero-mean, uniform $p(x^a_{k+1,n} \mid x^a_{k,n}, Z^a_k, \omega_i)$. For
an aspect angle rate known to be zero, conventional decision theoretic recognition is most
effective - this is in turn the limiting case of the FB algorithm for a bound of zero degrees.
This  does  not  imply,  however,  that  the  decision  theoretic  classifier  should  be  able  to  pick 
best  successive  looks  from  different  aspect  angles  on  any  one  target  model  -  rather  that 
the  “best”  maximum  likelihood  estimate  of  class  and  pose  will  be  provided  by  the  single 
aspect  angle  cell  on  one  target  model  that  yields  the  highest  joint  likelihood  of  originating 
the  observed  signature  sequence.  As  motivated  in  Sects.  3.5  and  3.6.6,  however,  making  a 
class  membership  decision  using  this  maximum  likelihood  value  for  one  aspect  angle  may 
be  suboptimal  compared  to  a  maximum  a  posteriori  estimate  considering  all  a  priori  likely 
(but  constant  over  the  measurement  period)  aspect  angle  cells  for  each  class. 
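
A small sketch of this distinction follows. It assumes a hypothetical table `lik[cls][cell]` holding the joint likelihood of the whole signature sequence for class `cls` at one fixed aspect cell, together with assumed prior weights; the names and numbers are illustrative and are not results from this research.

```python
def ml_class(lik):
    """Class owning the single largest likelihood over all aspect cells."""
    return max(lik, key=lambda cls: max(lik[cls]))


def map_class(lik, cell_prior, class_prior):
    """Weight each cell by its prior, sum over cells, then apply Bayes' rule."""
    joint = {
        cls: class_prior[cls] * sum(p * l for p, l in zip(cell_prior[cls], lik[cls]))
        for cls in lik
    }
    norm = sum(joint.values())
    post = {cls: v / norm for cls, v in joint.items()}
    return max(post, key=post.get), post


# Class A matches very well at one cell only; class B matches fairly well everywhere.
lik = {"A": [0.9, 0.0, 0.0, 0.0],
       "B": [0.5, 0.5, 0.5, 0.5]}
cell_prior = {"A": [0.25] * 4, "B": [0.25] * 4}  # assumed uniform over likely cells
class_prior = {"A": 0.5, "B": 0.5}               # assumed equal class priors

ml_choice = ml_class(lik)
map_choice, posterior = map_class(lik, cell_prior, class_prior)
```

Here the maximum likelihood rule selects class A on the strength of one cell, while the a posteriori rule selects class B because its likelihood is spread over all of the a priori likely cells.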

5.6.5  Miscellaneous  Issues.  Effects  from  discretization  in  the  aspect  angle  space 
were  noticed  in  these  tests  when  measurement  signatures  were  not  drawn  from  the  library 
cell locations, but rather from locations in between library cell values. Although signatures
preserve significant characteristics over small aspect angle changes, the Mahalanobis
metric  used  in  these  tests  was  sensitive  even  to  small  signature  changes.  The  primary 
result  of  this  effect  was  a  reduction  in  the  effectiveness  of  sequence  comparison  algorithms 
in  general  and  the  1-D  classical  algorithm  in  particular,  since  these  were  restricted  from 
finding  an  adequately  high  likelihood  match  on  the  true  target  model  -  even  on  the  correct 
model, poor discretization will drive the Independent Look algorithm to look far afield
for  a  minimum  cost  match.  Tuning  the  sequence  matching  algorithms  to  loosen  transition 
restrictions  improved  their  ability  to  find  low  cost  matches  on  the  true  model,  but,  as 
discussed  in  Sect.  3.7.2,  this  reduces  the  effectiveness  of  kinematic  restrictions  for  incorrect 
matches.  In  any  case,  the  Larson  and  Peschon-type  algorithm  retained  clear  superiority 
over the others. We should expect reduction of this discretization effect with finer
discretization, a less sensitive metric, or possibly, a different feature observable space (i.e., a
space  which  exhibits  less  variation  as  a  function  of  aspect  angle). 
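
The sensitivity noted above can be seen in a minimal sketch of the squared Mahalanobis distance. A diagonal covariance is assumed here purely for brevity; the covariance structure actually used in these tests is not specified in this section, and the function name is illustrative.

```python
def mahalanobis_sq(signature, library_mean, variances):
    """Squared Mahalanobis distance, assuming a diagonal covariance.

    Each squared difference is divided by the variance of that feature,
    so a small variance turns a small signature change into a large cost.
    """
    return sum((s - m) ** 2 / v
               for s, m, v in zip(signature, library_mean, variances))


d_loose = mahalanobis_sq([1.0, 2.0], [0.0, 0.0], [1.0, 4.0])
d_tight = mahalanobis_sq([0.1], [0.0], [0.0001])
```

The second call shows a difference of only 0.1 producing a squared distance near 100 when the assumed variance is small, which is the kind of behavior that penalizes measurements drawn from between library cells.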

In  addition  to  the  Mahalanobis  metric,  the  General  Dynamics  slide  distance  metric 
was  also  applied  in  these  scenarios,  but  was  found  to  be  ineffective  in  this  simulated  feature 
space  for  showing  differences  between  the  various  algorithms.  With  few  exceptions  (one  of 
which  was  noted  above)  these  algorithms  seemed  to  do  equally  well  or  equally  poorly  with 
the  slide  distance  metric.  This  is  presumably  a  function  of  tuning,  since  the  GD  algorithm 
is  signature-library  specific,  as  discussed  in  Chapter  II,  and  the  slide  distance  algorithm 
was not tuned for these artificial signatures. Note that the tests discussed in Sect. 5.4
were  conducted  under  entirely  different  rules  from  those  discussed  in  this  section  -  longer 
sequences,  more  differences  between  targets,  no  noise,  and  so  on.  In  any  case,  it  should  be 
clear  that  properly  limiting  the  domain  of  any  signature  matching  or  likelihood  function  in 
accordance  with  observed  kinematics,  using  any  signature-to-signature  metric,  should  not 
degrade  correct  recognition  events,  and  may  reduce  incorrect  recognitions  significantly. 

Finally, the reader should consider the value of analyzing maximum likelihood target
recognition systems with generalized ambiguity functions - allowing evaluation of the
“curvature”  of  the  likelihood  function  around  its  design  point.  While  separation  at  targets 
of  interest  (i.e.,  points  of  interest  in  parameter  space)  is  the  key  design  criterion,  high 
curvature  around  the  design  point  of  each  likelihood  function  should  be  of  high  secondary 
interest. Recall from Sect. 2.7 that this curvature is directly related to the Cramer-Rao
lower bound (CRLB) for the estimator used to develop the GAF [154]: practical evaluation
of this bound using our approach requires one to generate target “morphs” or interpolations
arbitrarily close to the design parameter point, and evaluate the behavior of the
GAF in this region.
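
For reference, the standard scalar form of the bound for an unbiased estimator is sketched below. The symbols are assumptions made here for illustration (a scalar target parameter $\theta$, its estimate $\hat{\theta}$, and measurements $Z$) and may differ from the notation of Sect. 2.7.

```latex
\operatorname{var}\left(\hat{\theta}\right) \;\ge\;
\left( -\,E\left[ \frac{\partial^{2} \ln p(Z \mid \theta)}{\partial \theta^{2}} \right] \right)^{-1}
```

The expectation on the right is the mean curvature of the log-likelihood at the true parameter value, so a more sharply peaked likelihood gives a smaller bound on the estimate variance.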

As  implied  by  the  behavior  of  the  generalized  ambiguity  functions  in  the  figures  shown 
(particularly  Fig.  5.31),  the  limiting  value  of  the  CRLB  for  these  estimators  is  evidently 
given  by  the  CRLB  to  be  found  in  this  fashion  for  the  PKA  algorithm  (i.e.,  joint  maximum 
likelihood  for  known  aspect  angle  over  time).  In  any  case,  the  figures  make  it  clear  that 
the separability of any two target classes depends on much more than behavior of the GAF
around the true target parameter point. Thus, the concept of a CRLB in this construct
is perhaps not of greatest interest where we simply wish to identify a set of measurements
as  belonging  to  one  of  several  a  priori  known  points  in  some  parameter  space.  Rather, 
the  CRLB  may  be  most  useful  where  we  wish  to  perform  classical  parameter  estimation: 
for  example,  quantifying  the  extent  to  which  we  are  able  to  learn  the  optimum  location 
in  some  finite-dimensional,  model-based  target  parameter  space  to  represent  a  previously 
unclassified  real  target,  known  only  by  tracking  data  and  signature  sequences.  We  will 
return  to  this  subject  in  Chapter  VI. 

5.7 Summary

In  conclusion,  the  results  of  this  section  show  that  “motion  warping”  -  the  application 
of  dynamic  programming  sequence  comparison  in  moving  object  recognition  -  is  a  highly 
effective  concept  for  multisensor  fusion.  This  approach  exploits  the  joint  likelihood  of 
signatures,  conditioned  on  observed  kinematics,  to  reduce  ambiguity  in  recognition. 

The “full” or optimal Larson and Peschon and classical sequence comparison approaches
demonstrated here represent significant new additions to the previous research of
Le  Chevalier  [136]  and  Mieras  [164,  165].  This  research  shows  that  all  of  these  algorithms 
are  simply  members  in  a  large  family  of  forward  dynamic  programming-based  sequence 
comparison  techniques.  The  final  answer  as  to  which  approach  is  better  is  expected  to  be 
a  function  of  the  particular  application.  Classical  performance  tests,  generalized  ambiguity 
function  analysis  and  consideration  of  computational  requirements  can  be  used  to  select 
the  best  technique  for  a  particular  object  class,  sensor,  and  so  on. 

The  results  given  here  indicate  that  applying  the  full  Larson  and  Peschon  algorithm 
provides  more  accurate,  less  ambiguous  recognition  than  either  suboptimal  “fixed  bound” 
Larson and Peschon-type techniques (similar to those of Le Chevalier and Mieras) or
classical sequence comparison approaches. Under demanding, ambiguous target signature
conditions, the Larson and Peschon algorithm best exploits a priori information on target
motion.  This  algorithm  demonstrated  the  ability  to  make  correct  target  recognitions  nearly 
100%  of  the  time,  when  correct  recognition  by  an  independent  look  processor  fell  to  80%  or 
less.  Suboptimal  Larson  and  Peschon-type  techniques  and  classical  sequence  comparison 
approaches generally provided performance intermediate between that of “independent
look”  and  optimal  Larson  and  Peschon  methods. 

Moreover, the results of this chapter have demonstrated a new approach for establishing
bounds on the performance of recognition algorithms. Treating dynamic object
recognition as a problem in parameter estimation, and conducting performance analysis
with generalized ambiguity functions allows definition of Cramer-Rao lower bounds for the
covariance  of  “parameter  estimates”  by  the  recognizer. 

In  sum,  these  results  provide  significant  new  directions  for  the  development  and 
analysis  of  dynamic  object  and  tactical  target  recognition.  They  can  be  used  alone  or  in 
combination  with  other  methods  proposed  in  this  dissertation  to  exploit  the  joint  likelihood 
or  syntax  of  observable  events.  The  following  chapter  will  discuss  directions  in  which  further 
development  might  proceed. 

VI. Further Developments Exploiting Joint Likelihood in Object Recognition

6.1 Introduction

The  author  believes  that  this  work  and  prior  contributions  by  Therrien  [211],  Le 
Chevalier et al. [136, 135] and Mieras et al. [164, 165] have only begun to assess the potential
inherent in syntactic approaches to dynamic object recognition and other multisensor
fusion  applications  -  in  particular,  tactical  target  recognition.  The  author  has  shown  that 
syntactic  methods  have  great  potential  for  identifying  those  target  parameter  sets  with 
highest  joint  likelihood  of  generating  observed  kinematic  and  sensor  signature  events.  The 
purpose  of  this  chapter  is  to  discuss  particular  directions  and  tasks  for  extensions  to  this 
research. 

The  two  primary  directions  of  this  research  -  Steps  One  and  Two  as  proposed  in 
the introduction to Chapter III - were the use of (1) conventional multiple model parameter
estimators with kinematic/aspect-angle trackers to assess the joint likelihood of
observed kinematics, conditioned on feature observable measurements, and (2) dynamic
programming-based sequence comparison techniques to assess the joint likelihood of measured
feature observables, conditioned on kinematic measurements. The most important
extension  required  to  this  research  is  to  investigate  Step  Three  -  a  new  estimator  structure 
which  combines  the  two.  This  is  the  principal  topic  of  the  first  section  below. 

6.2  Sequence  Comparison  Methods  for  Single  Object  Identification 

In  this  section,  we  concern  ourselves  with  the  problem  of  tracking  and  classifying  a 
single dynamic object, i.e., the case in which all measurements are unambiguously associated
with one origin object. As discussed above and in Sect. 3.9, great promise exists
for object recognition algorithms combining (1) classical sequence comparison and/or Larson
and Peschon approaches for sequential signature processing with (2) a Kendrick-type
kinematic/aspect-angle  tracker  or  even  a  standard  kinematic  tracker.  Several  possibilities 
are  suggested  in  the  following  subsections. 

6.2.1  A  New  Class  of  Estimator  for  Object  Recognition  and  Tracking.  Recall  from 
Sect.  3.6  that  optimal  use  of  the  Larson  and  Peschon  equations  requires  explicit  knowledge 
of the a priori likelihood of transitions on the aspect angle space, or $p(x^a_{k,n} \mid x^a_{k-1,n}, \ldots)$.
For  aircraft  targets,  this  information  requires  reasonable  estimates  of  the  target  acceleration 
states.  Conventional  kinematic  trackers  cannot  provide  this  information  in  real  time,  and 
their  outputs  require  some  form  of  smoothing  to  derive  reasonable  aspect  angle  estimates. 
Kinematic/aspect-angle  trackers,  on  the  other  hand,  can  provide  reasonable  acceleration 
estimates  in  real  time. 

Conversely,  the  kinematic/aspect  tracker  requires  target  pose  estimates,  which  yield 
the  aspect  angle  pseudo-measurement.  A  Larson  and  Peschon-type  algorithm  provides  this 
pose estimate as $x^a_k$, i.e., the latest or $k$-th aspect angle state in the Larson and Peschon
(LP)-derived maximum likelihood sequence of aspect angle states for object class $\omega_i$.

The  evident  next  step,  then,  is  to  link  a  Larson  and  Peschon-type  estimator  and  a 
Kendrick/Maybeck/Reid-type estimator, so that each provides the information required
by the other. The result is believed to constitute a new form of estimator. This estimator
allows one to process (1) information which does not conform to rules appropriate
for  linear  or  quasi-linear  estimators,  like  radar  signatures  from  an  aspect  angle  space,  in 
conjunction  with  (2)  information  that  does  conform  to  those  rules,  like  range  and  pointing 
angle  measurements  from  an  object  moving  in  physical  space.  An  equivalent,  completely 
linear  or  quasi-linear  estimator  may  be  impossible,  and  a  possibly  feasible  Larson  and 
Peschon  structure  to  process  both  kinematic  and  feature  measurements  would  have  very 
high  dimensionality  indeed.  The  proposed  Larson  and  Peschon  /  linear  filter  structure 
would  seem  to  be  a  feasible  and  efficient  way  to  process  all  available  information. 

Time  constraints  in  this  research,  and  the  disjoint  structure  of  simulations  built 
for  Steps  (1)  and  (2)  did  not  permit  demonstration  of  the  estimator  approach  for  Step 
(3).  When  demonstrated,  the  resulting  algorithm  will  allow  one  to  combine  explicitly 
(1)  the  likelihoods  of  observed  kinematics  and  kinematics-related  properties  (e.g.,  mass), 
conditioned on observed signatures, with (2) the likelihoods of observed signatures, conditioned
on observed kinematics, for each candidate object class. This will be a true joint
likelihood  object  recognition  algorithm  using  all  available  information,  and  conforming  to 
well-understood  classical  parameter  estimation  practice. 

Through this approach, linear or quasi-linear assumptions and requirements common
to conventional parameter estimation could be relaxed considerably where necessary.
Inherently,  however,  like  conventional  parameter  estimation  approaches,  a  recognizer  of 
this  form  will  consist  of  a  set  of  estimators,  one  for  each  expected  or  otherwise  desired 
parameter  set,  combined  in  a  Bayesian  multiple  model  estimator  structure  -  a  multiple 
model  adaptive  estimator  structure,  if  the  estimators  are  designed  to  alter  their  parameters 
according  to  observed  results. 
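
The Bayesian multiple model combination can be sketched in a few lines. The sketch assumes that each per-class estimator reports a likelihood for the newest measurement given its own model; the function name, class labels, and numbers are hypothetical and are not taken from the simulations reported here.

```python
def update_class_posteriors(prior, likelihoods):
    """One Bayes update of class probabilities from per-model likelihoods.

    prior[cls] is the class probability before the new measurement, and
    likelihoods[cls] is the likelihood of that measurement reported by the
    estimator built on the parameter set for class `cls`.
    """
    unnormalized = {cls: prior[cls] * likelihoods[cls] for cls in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("no candidate model explains the measurement")
    return {cls: v / total for cls, v in unnormalized.items()}


# Two candidate classes, updated recursively over two measurements.
posterior = {"class_1": 0.5, "class_2": 0.5}
first = update_class_posteriors(posterior, {"class_1": 0.2, "class_2": 0.05})
second = update_class_posteriors(first, {"class_1": 0.1, "class_2": 0.1})
```

A measurement that both models explain equally well, as in the second update, leaves the class probabilities unchanged; only measurements that discriminate between the models move the posterior.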

6.2.2  Other  Approaches  for  Object  Recognition  with  Sensor  Signature  Sequences. 
The primary thrusts of this research - sequence comparison with dynamic programming,
residual analysis using kinematic/aspect state estimators, and the combination of the two
proposed in the previous subsection - are by no means the only approaches for combining
this information. A completely different approach, for example, would be to treat the
pose estimate history (Sect. 2.2.1) and the kinematically-estimated aspect angle history
(Sect. 2.3.3.1) as separate tracks on the two-dimensional surfaces of the hypothetical aspect
angle spheres for each of several candidate objects. Then, classical track-to-track association
schemes [33, 10, 218] could be used to determine which track-to-track association was
best,  implying  a  correct  choice  of  object  class.  For  objects  and  feature  spaces  where  pose 
estimates  are  reasonably  well-behaved  (which  is  generally  not  the  case  with  HRR  radar), 
and  perhaps  in  other  cases,  this  approach  might  be  workable,  and  have  low  computation 
requirements. 

6.2.3  Sequence  Comparison  for  A  Priori-Likely  Sequences.  The  use  of  dynamic 
time  warping  for  word  detection  in  continuous  speech  [176]  suggests  a  method  for  detecting 
aircraft  target  roll  maneuvers  at  the  time  of  their  occurrence,  i.e.,  prior  to  the  development 
of normal load acceleration and kinematically-observable events. Basically, if we are tracking
an aircraft, and have a reasonable idea of its class and current aspect angle, we can
construct sequences of likely sensor signatures corresponding to particular roll maneuvers.
These sequences will then be continuously compared with the observed signatures - the
onset of a roll maneuver will be marked by a sudden “match” between the incoming signature
sequence and one of the constructed sequences. In effect, this is a form of multiple
model estimator. Recognition of such a roll event would cue the kinematic tracker for
changing  conditions. 

A development of the “fixed bound” Larson and Peschon or Le Chevalier-type syntactic
algorithms [136] (i.e., requiring no a priori information other than bounds) would
also find use in this application. Applying such an algorithm to sequences of observations
from the target would, for a stationary aspect angle, indicate a sequence of maximum
likelihood associations in approximately the same aspect angle area. In a turn event, the
sequence  of  associations  would  proceed  off  in  a  direction  indicative  of  a  turn  -  the  change 
in  associated  aspect  angle  would  be  derived  readily  from  the  algorithm.  The  path  direction 
should  not  be  expected  to  be  as  accurate  as  would  be  the  case  if  a  priori  information  were 
available  from  kinematics,  but  it  may  be  accurate  enough  to  cue  the  kinematic  tracker. 

Extensions  of  this  concept  would  use  dynamic  programming-based  methods  or  other 
sequence  comparison  techniques  for  identifying  any  object  for  which  particular  feature 
sequences can be predicted or expected, in the absence of an ability to estimate the underlying
kinematics or other state transitions directly. Some tactical targets have “canned”,
operator-learned, or otherwise a priori-likely maneuvers which may generate characteristic
feature sequences under circumstances in which target kinematic state measurements
are  not  available,  or  in  which  observable  kinematic  states  are  not  highly  correlated  to 
the  feature  values.  For  example,  particular  exoatmospheric  or  re-entering  objects  have 
characteristic  signatures  imparted  by  periodic  or  reasonably  predictable  behavior.  Other 
speech  processing-related  techniques  have  already  been  considered  in  this  particular  strate¬ 
gic  defense-related  area,  such  as  Therrien’s  use  of  Linear  Predictive  Coding  (a  technique 
often  used  in  speech  analysis)  to  model  target  signatures  [211]. 

6.3  Sequence Comparison Methods for Multi-Object Tracking and Data Association

Here  we  consider  the  problems  of  tracking  and  classifying  multiple  objects,  i.e.,  the 
case  in  which  each  measurement  cannot  be  unambiguously  associated  with  any  one  origin 
object.  All  of  the  techniques  and  algorithms  discussed  in  this  research  for  exploiting 
the  joint  likelihood  of  observed  events  and  known  object  classes  are  applicable  to  these 
problems. Some cases of particular interest are given below.

Figure 6.1. Problem: Associating Signatures From Imaging and Radar Sensors (figure
inspired by [96])

6.3.1  Motion Warping for Observation-to-Observation Association. Classically,
the spatial resolution capabilities of radar and passive optical sensors are counterposed -
that is, radar sensors have limited angular resolution, but excellent range resolution, while
passive optical sensors have excellent angular resolution, but no range resolution (excepting
secondary methods like stadiametric range finding, based on knowledge of object class,
orientation, and angular extent in the image plane, etc.). At least one recent article [96]
notes  that  the  problems  of  associating  objects  between  imaging  and  radar  sensors  are  not 
completely  solved. 

Consider  the  situation  described  in  Fig.  6.1,  taken  essentially  from  the  last  reference. 
This  situation  could  arise,  for  example,  in  observing  three  moving  objects  on  a  relatively  flat 
surface  at  long  distance  from  a  remotely-piloted  vehicle  equipped  with  low-cost  imaging  and 
radar  sensors.  We  wish  to  direct  artillery  fire  on  object  (A),  but  are  unable  to  determine 
its  range  unambiguously,  since  the  angular  resolution  cell  of  the  radar  encompasses  the 
three  objects  in  the  imaging  sensor  field  of  view,  and  arbitrary  orientation  of  the  objects 
makes stadiametric range finding impossible.

Using the approach applied in Chapter V, however, we can say that a total of three
possibilities exist for associating object (A) with the range signatures. If the object class
of (A) is unknown, then at least in general we can expect that (A) is a member of one class
ω_j of J a priori likely object classes. Each of these possibilities defines a particular
object-trajectory combination on the ground surface (which may be more or less well known),
and therefore implies a particular sequence of signatures in the image and radar domains.
The maximum likelihood association of (A) with (1), (2), or (3) and with one of the J
object classes should be the one for which the joint likelihood of observed kinematics and signature
sequences (in image and radar domains) is greatest. Thus, the approaches defined in this
research may provide solutions for problems now considered to be unsolvable.
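
As a minimal sketch of the selection rule above, the following Python fragment picks the association and object class with the greatest joint log-likelihood. The function name `best_association`, the dictionary layout, the class labels, and the numeric values are hypothetical and are not taken from this research. Summing the kinematic, image, and radar terms assumes they are conditionally independent given the hypothesis, which the full method need not assume.

```python
def best_association(log_terms):
    """Pick the hypothesis with the greatest joint log-likelihood.

    log_terms maps a hypothesis (range_signature_id, object_class) to a tuple
    of log-likelihood terms, e.g. (kinematic, image, radar). Summing the
    terms assumes conditional independence between domains (an assumption
    made for this sketch only).
    """
    scores = {hyp: sum(terms) for hyp, terms in log_terms.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative values only: three range signatures, two hypothetical classes.
example = {
    (1, "tracked"): (-2.0, -3.0, -4.0),
    (2, "tracked"): (-1.0, -1.5, -2.0),
    (2, "wheeled"): (-1.0, -4.0, -3.5),
    (3, "wheeled"): (-5.0, -0.5, -6.0),
}
best, scores = best_association(example)
print(best, scores[best])
```

In practice each log-likelihood term would come from a kinematic tracker or a signature sequence comparison for the hypothesized object-trajectory combination, rather than from fixed numbers.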

An  extension  to  this  fusion  process  might  provide  for  feedback  to  improve  the  sensor 
outputs  themselves  -  for  example,  by  providing  a  “best”  choice  for  a  model  and  time- 
varying orientation to be used in a model-based segmentation algorithm [27, 29].

6.3.2  Joint Likelihood in Observation-to-Track Assignment. As in the preceding
problem, one could augment the decision process for observation-to-track assignment with
“motion warping” and related means for exploiting the joint likelihood of kinematic measurements
and sensor signatures. This development should provide a tangible improvement
over state-of-the-art approaches as discussed in Sect. 2.3.3. Inherently, the maintenance
of a track history gives a departure point for likely kinematics, which imply likely feature
observations, and so on. At least, this research should persuade researchers to treat
“nonkinematic” variables and measurements as interdependent with “kinematic” ones, and
provide options for exploring their relationships in multisensor fusion.

6.4  Mathematical Issues in Multisensor Fusion for Recognition

In an early article on pattern recognition, Ho and Agrawala [108] observed that many
concepts  well  understood  in  the  context  of  state  and  parameter  estimation  were  applicable 
in  pattern  recognition.  This  remains  true,  and  in  this  section  we  will  discuss  particular 
directions  that  evidently  remain  to  be  explored  regarding  the  application  of  state  and 
parameter estimation concepts to dynamic object recognition.

6.4.1  Parameter Estimation and Object Recognition. In general, current object
recognition efforts tend to focus on assigning sets of measurements from an unknown object
to one element in a finite set of a priori known object classes. It is believed, however, that
a potentially fruitful area of research would be to consider continuous object parameter
spaces, and the use of measurements to define points in these parameter spaces with
maximum likelihood of having generated the measurements. As discussed in Sect. 3.11,
real object parameter spaces are infinite-dimensional, but we are forced to model them in
finite-dimensional spaces, choosing levels of model fidelity that match real objects to within
some distance in some norm or distance measure. Using the mathematics of functional
analysis [170], it should be possible to pose and answer many elegant questions regarding
our ability to model or identify real objects using finite-dimensional models.

The practical application of this research is that it allows us to consider object
identification when the unknown object class is not one of those well characterized a priori.
We may obtain measurements of a completely new object, and be able to make
conjectures about its parameter set in our finite-dimensional model space, based upon
well-characterized points in the parameter space which correspond to known objects. This
is no more than a proposal to use well-known parameter estimation concepts [154] for
estimating or learning object parameters during recognition.

As  noted  in  Sect.  2.7,  the  ability  to  compute  an  ambiguity  function  gives  us  the 
ability  to  define  a  Cramer-Rao  lower  bound  on  the  parameter  estimate  error  covariance 
matrix.  For  a  set  of  parameters  “learned”  as  proposed  in  the  previous  paragraph,  this 
lower  bound  gives  us  a  measure  of  confidence  regarding  the  quality  of  our  estimate.  Even 
where  we  do  not  desire  to  learn  parameters,  this  covariance  bound  should  give  us,  for 
two  arbitrary  loci  of  points  in  state/parameter  space  corresponding  to  well-characterized 
object  classes  engaged  in  some  particular  maneuver,  an  understanding  of  how  well  we  can 
expect  to  discriminate  between  them  with  our  classifier.  Due  to  the  limited  number  of 
observations  likely  in  a  tactical  scenario,  this  will  require  not  only  a  well-shaped  ambiguity 
function,  but  also  likelihood  functions  which  have  probability  density  functions  closely  held 
around the mean, or ambiguity function, value.

6.4.2  The Motion Warping Process as a Nonlinear Functional. As discussed
in App. B, a nonparametric linear classifier is a linear functional [170] - that is, a linear
function, operating on a vector (feature) space over a scalar field (the real numbers, in
this case), the output of which is an element from the scalar field (a distance). The
mathematical structure of linear functionals prescribes the existence of a vector space
dual to the feature space, in terms of which dual space the parameters of the functional
can be stated - specifically, the equation of the hyperplane corresponding to the linear
classifier is described by a scalar and a point in the dual space, which is equivalent to
the first n dimensions in the (n + 1)-dimensional space called the weight space by Tou and
Gonzalez [212].

Similarly,  the  motion  warping  processes  and  other  joint  likelihood  methods  described 
in  this  research  are  nonlinear  functionals  -  nonlinear  functions,  operating  on  a  Cartesian 
product  space  of  aspect  angles,  features,  and  kinematic  measurements,  to  produce  a  scalar 
output  (likelihood,  or  perhaps  simply  warping  path  distance).  Many  of  these  algorithms  are 
subject  to  further  classification  as  functionals:  for  example,  for  the  feature  observable-to- 
feature  observable  (e.g.,  range  sweep  to  range  sweep)  Mahalanobis  metric  employed  in  this 
research  (which  treats  the  range  sweeps  as  elements  in  a  finite-dimensional  Hilbert  space), 
it  can  readily  be  shown  that  motion  warping  is  a  bounded  nonlinear  functional  [170:344]. 
These  observations  may  provide  a  path  for  further  theoretical  analysis  of  the  concepts 
proposed  in  this  research. 

6.4.3  Sequence Comparison for Multiresolution Analysis. Recent efforts have
been directed toward the use of wavelets, optimal estimators, and other tools for analysis
of multiresolution or multiscale processes [150, 17, 54]. In this regard, it should be
noted that the sequence comparison techniques discussed in this dissertation inherently
have potential for multiresolution applications. For example, one can conduct sequence
comparison processes for a range of “fineness” or spatial discretization levels. The results
of these comparisons may reveal the presence of grammatical or syntactic structures ranging
from coarse to fine in the space over which the comparisons are conducted. Iterative
algorithms may provide for progressively finer investigation of the likelihood that particular
spaces generated particular sequences or subsequences, ceasing investigation when finer
resolution appears pointless.

6.5  Alternative  Approaches  for  Sequence  Comparison. 

The research detailed here has focused on forward dynamic programming-based techniques
for sequence comparison, but these are not the only tools for this task - there are
many methods for comparing sequences. The principal competitor to dynamic time warping
in the speech recognition problem is the hidden Markov model (HMM). In turn, the
HMM can be considered as a form of stochastic automaton [176:309], which, like dynamic
programming-based sequence comparison, is a key tool in general syntactic pattern recognition,
as discussed in Chapter II. Another alternative class of approaches is offered by
neural networks. The author chose not to investigate these alternatives, primarily due
to what he perceives as disadvantages to them, as alluded to in Chapter I. Subsequent
to the start of this research, some aspects of these approaches have been investigated at
AFIT [128, 68, 83], as noted in Chapter II, and independent effort using these methods
has been undertaken elsewhere [81, 199].

A  key  issue  to  be  addressed  by  researchers  applying  HMM’s,  neural  nets,  and  other 
“trained”  algorithms  to  this  problem  is  how  to  train  the  algorithms  for  a  variety  of  expected 
scenarios  without  introducing  ambiguity  into  the  recognition  process.  There  may  be  a  need 
for  research  to  define  the  tradeoffs  between  maintaining  large  libraries  from  which  sequences 
can  be  constructed  at  will  (as  in  this  research),  versus  maintaining  large  numbers  of  HMM’s 
and/or  neural  nets,  each  defined  for  a  particular  set  of  events. 

6.6  Miscellaneous  Issues  and  Extensions  To  This  Research 

Aside  from  the  further  algorithm  developments  proposed  above,  a  number  of  perhaps 
more  mundane  but  critical  remaining  issues  can  be  explored  using  the  basic  algorithms 
employed  for  this  research.  These  include,  but  are  not  limited  to,  the  following: 

6.6.1  Feature  Space  and  Distance  Metric  Choices.  This  research  has  focused 
on the use of “motion warping” or dynamic programming-based sequence comparison for
moving object recognition using a feature space of high range resolution radar sweeps. The
method, however, should be applicable to any of the global descriptor features, as generated
by imaging or other sensors, discussed in App. B. A particularly powerful approach would
appear to be that of making syntactic or motion warping object recognition based on
simultaneous analysis of feature observables from multiple domains - infrared, visual light,
and ultraviolet spectrum or image-extracted features, narrowband and wideband (high
range resolution) radar, and so on. The objective here would be to exploit the concept
of  joint  likelihood  further  by  making  use  of  the  processes  in  two  or  more  feature  domains 
(ideally,  stochastically  independent  ones)  to  reduce  the  likelihood  further  that  an  incorrect 
object  class  could  generate  observed  feature  sequences. 

6.6.2  Application  to  Other  Classes  of  Objects.  The  discussion  in  this  dissertation 
has  been  primarily  oriented  toward  air  and  ground  tactical  targets.  The  principles  and 
analysis  tools  discussed  here,  however,  should  be  applicable  to  many  other  object  and  target 
classes  -  animate  beings,  ships,  satellites,  etc.  The  requirement  is  simply  to  have  some 
set  of  restrictions  which  couple  state  transitions  and  measurement  generation  according  to 
known  rules  (parameters  unique  to  each  class),  combined  with  measurements  from  multiple 
sources  that  will  conflict  when  incorrect  parameter  sets  are  assumed. 

Also,  although  we  have  spoken  most  often  with  regard  to  dynamic  object  recognition 
in  this  effort,  it  should  be  clear  that  the  principles  behind  this  research  also  apply  to  the 
recognition  of  stationary  objects.  Simply,  if  it  is  known  that  a  given  object  is  stationary 
in  aspect  angle  and/or  translation  with  respect  to  the  sensor,  one  should  not  match  noisy 
sensor  signatures  from  that  target  with  library  models  in  a  way  that  allows  or  requires  the 
models  to  translate  and/or  rotate  between  signatures.  To  do  so  is  to  invite  an  incorrect 
recognition. 

6.6.3  Path  Discretization.  In  virtually  every  discussion  of  dynamic  programming- 
based  processes,  the  discussion  recognizes  that  continuous  spaces  must  be  discretized,  but 
gives no guidance as to the appropriate discretization fineness. In Sect. 5.6, we discussed
one effect of inadequate discretization observed using the RCSToolBox-generated feature
space and the Mahalanobis metric. The sensitivity of the dynamic programming-based
algorithms proposed here to discretization fineness should be further evaluated. The ambiguity
function techniques discussed above provide a natural, if empirical, way to conduct
such analysis.

That  is,  considering  likelihood  function  values  due  to  feature  observable  matching 
only,  one  attribute  of  a  proper  discretization  level  may  be  the  absence  of  a  difference  at 
the  true  parameter  point  f lt  between  ambiguity  function  values  for  sequence-matching 
and  non-sequence-matching  algorithms.  This  means  that,  on  the  true  target,  there  is  no 
cost  penalty  for  sequence  matching  algorithms  due  to  “false”  pose  estimate  motion  on  the 
target,  like  that  caused  when  the  matching  process  “hunts”  to  find  the  best  cell  from  the 
available  discrete  set  which  might  have  generated  the  observed  measurements  from  the 
underlying  continuous  space  (in  our  case,  aspect  angle). 

6.6.4  Alternative Kinematic/Aspect-Angle Filter Designs. The effects and possibilities
noted in Chapter IV deserve to be repeated using filter models with higher fidelity -
in particular, with parameters and aspect angle pseudo-measurements matched to
real objects and sensors of interest. This proposal is made with regard to a conventional
kinematic/aspect-angle estimator construct, i.e., one that does not explicitly predict and
compare feature observable measurements.

6.6.5  Application of the Larson and Peschon Equations in Speech Recognition and
Other Areas. In Sects. 2.4.5 and 3.8, we compared the Larson and Peschon equations
to elements from the family of classical dynamic programming-based sequence comparison
methods. We noted that the Larson and Peschon equations are, from a general perspective,
simply one member of this family, distinguished from other members by very special
continuity constraints. This observation leads one to consider the possibility of applying
the Larson and Peschon equations in classical sequence comparison tasks, like speech
recognition. The immediate advantage to consider, as observed in the referenced sections,
is that “Larson and Peschon”-type sequence comparison would not be penalized for length
differences between two sequences to be compared. As we have seen in Chapter V, subject
to its a priori guidance on sequence element separation, the Larson and Peschon approach
can readily match a very short sequence of n elements to the “best” n elements from a
very long sequence of m elements (n < m). From that result, we might desire in some
fashion to compare the “skipped” elements of the longer sequence to the nearest elements
of the shorter sequence, but at least we have a good starting point - much better than we
would be likely to achieve with classical “continuous” sequence comparison.
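
The short-to-long matching described above can be illustrated with a generic dynamic programming sketch. This is not the Larson and Peschon equations themselves: it is a simplified stand-in in which a hypothetical `max_step` bound plays the role of the a priori guidance on sequence element separation, skipped elements of the longer sequence incur no cost, and the squared-difference distance is an arbitrary choice made for the example.

```python
import math


def match_short_to_long(short_seq, long_seq, max_step,
                        dist=lambda a, b: (a - b) ** 2):
    """Match each element of short_seq to an increasing index of long_seq.

    Consecutive matched indices differ by 1..max_step. Elements of long_seq
    that are skipped add no cost. Returns (total_cost, matched_indices).
    """
    n, m = len(short_seq), len(long_seq)
    if n == 0 or n > m:
        raise ValueError("need 0 < len(short_seq) <= len(long_seq)")
    cost = [[math.inf] * m for _ in range(n)]
    back = [[-1] * m for _ in range(n)]
    for j in range(m):
        cost[0][j] = dist(short_seq[0], long_seq[j])
    # Forward pass: best cost of ending element i of short_seq at index j.
    for i in range(1, n):
        for j in range(i, m):
            d = dist(short_seq[i], long_seq[j])
            for k in range(max(i - 1, j - max_step), j):
                cand = cost[i - 1][k] + d
                if cand < cost[i][j]:
                    cost[i][j] = cand
                    back[i][j] = k
    end = min(range(m), key=lambda j: cost[n - 1][j])
    if math.isinf(cost[n - 1][end]):
        raise ValueError("no alignment satisfies the step bound")
    # Backtrack from the cheapest end point.
    path = [end]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return cost[n - 1][end], path


long_seq = [0.0, 1.0, 2.0, 5.0, 7.0, 9.0, 10.0]
total, indices = match_short_to_long([1.0, 5.0, 9.0], long_seq, max_step=3)
print(total, indices)
```

Because the total cost sums over the n matched elements only, the length difference m - n is not penalized, which is the property the text highlights. A follow-on pass could compare the skipped elements to their nearest matched neighbors, as suggested above.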

For other possible applications of the Larson and Peschon approach (and other aspects
of the research presented here) with regard to ISAR radar and related areas, the
reader is referred to a previous article by Le Chevalier [137]. Note that the referenced
article is classified and in French, without translation in the original document.

6.7  Summary.

This chapter has attempted to point the reader toward promising areas for fundamental
research, and to list applications for the approaches and algorithms developed in
this  effort.  It  is  clear  that  much  more  can  be  done  to  apply  syntactic  pattern  recognition 
and  classical  parameter  estimation  techniques  to  multisensor  fusion  and  object  recognition. 

Syntactic or joint likelihood approaches to object recognition should always be considered
when the objects of interest obey known rules that couple behavior (state dynamics)
and appearance (measurable or observable quantities) in characteristic ways (as functions
of parameters unique to each object class). This chapter shows promising directions for applying
the results of this research and other methods toward syntactic solutions to current
problems.

VII.  Summary and Conclusion


This  research  explored  methods  by  which  the  obvious  interrelationships  between 
target  motion  and  sensor  signatures  could  be  exploited  for  recognition  of  dynamic  objects 
in  general  and  tactical  targets  in  particular.  A  wide  variety  of  methods  and  directions  were 
taken  in  this  effort,  but  all  culminate  in  one  basic  result  -  producing  likelihood  functions 
that  exploit  the  joint  or  coupled  nature  of  physical  processes  over  time  for  known  object 
classes,  to  reduce  ambiguity  in  deciding  the  most  likely  object  class  to  have  produced  a 
given  sequence  of  measurements  or  observations. 

This  research  can  be  understood  from  at  least  three  different,  yet  fundamentally 
equivalent perspectives. First, from a probabilistic standpoint, we have defined new expressions
for the joint likelihood of observed events from known target classes, conditioned
on  past  measurements  and  a  priori  information  for  each  class.  Unlike  conventional  pattern 
recognition  approaches,  we  are  specifically  interested  in  the  joint  likelihood  of  events  over 
time,  since  we  wish  to  consider  the  likelihood  of  physical  processes  implied  by  those  events. 
Ideally, these likelihoods would be used with Bayesian methods to estimate the a posteriori
probability of class membership for an unclassified target, conditioned on all available
information.
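
A minimal sketch of the Bayesian step mentioned above: given one joint log-likelihood per candidate class and a priori class probabilities, Bayes' Rule yields the a posteriori probabilities of class membership. The function name `class_posteriors` and the example numbers are hypothetical. The log-likelihoods themselves would come from the joint likelihood expressions developed in this research, which are not reproduced here.

```python
import math


def class_posteriors(log_likelihoods, priors):
    """Apply Bayes' Rule to per-class log-likelihoods.

    log_likelihoods: dict mapping class name -> ln p(observations | class).
    priors: dict mapping class name -> a priori probability (must be > 0).
    Returns a dict mapping class name -> a posteriori probability.
    """
    log_joint = {c: log_likelihoods[c] + math.log(priors[c])
                 for c in log_likelihoods}
    # Subtract the peak before exponentiating to avoid numerical underflow.
    peak = max(log_joint.values())
    unnormalized = {c: math.exp(v - peak) for c, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {c: u / total for c, u in unnormalized.items()}


# Illustrative values only: two hypothetical classes with equal priors.
posteriors = class_posteriors(
    {"class_1": math.log(0.2), "class_2": math.log(0.6)},
    {"class_1": 0.5, "class_2": 0.5},
)
print(posteriors)
```

Working in the log domain is a design choice for this sketch: likelihoods of long measurement sequences are products of many small terms and can underflow if multiplied directly.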

Second,  this  research  can  be  posed  as  an  extension  to  the  theory  and  practice  of 
syntactic  pattern  recognition  -  that  is,  the  class  of  algorithms  which  distinguish  entities 
using  order  or  structure,  commonly  by  representing  an  observed  entity  as  a  sequence  of 
elements  over  time,  and  assessing  the  likelihood  that  any  known  class  could  generate  the 
observed  sequence.  For  known  physical  objects,  the  process  of  generating  sequences  or 
sequence  spaces  of  expected  elements  for  each  class  inherently  considers  the  joint  or  coupled 
nature  of  the  processes  which  produce  the  sequences.  We  may  then  apply  a  range  of 
available  syntactic  sequence  comparison  tools  to  associate  the  observed  sequence  with  its 
most  likely  origin  class. 

Lastly,  this  research  can  be  considered  as  having  defined  techniques  for  restricting 
the domains of likelihood functions used to identify known object classes, making restrictions
according to joint or coupled aspects of behavior expected over time. Restricting
the domain of an object class-specific likelihood function or matching function tends to
prevent  that  function  from  improperly  associating  itself  with  a  sequence  of  measurements 
that  should  not,  all  facts  considered,  have  come  from  that  object  class.  For  the  correct 
combination  of  observed  sequence  and  likelihood  function,  proper  domain  restriction  will 
not  hinder  the  association  process. 

The results of this research are major extensions to the theory and practice of sequence
comparison methods in dynamic object recognition, as pioneered by Therrien [211],
Le Chevalier [136] and Mieras [164, 165]. Furthermore, they strongly confirm the observations
of Agrawala and Ho [108] and Daum [9:177-178] regarding the potential applicability
of  state  and  parameter  estimation  techniques  in  object  recognition. 

The  theoretical  and  practical  contributions  of  this  research  are  restated  below  from 
Chapter III, and demonstrated results from Chapters IV and V are referenced to support
each  claim.  They  include: 

(1)  Extension  of  conventional  multiple  model  residual  sequence  analysis  techniques  and 
kinematic/aspect-angle  trackers  to  provide  new  methods  for  object  and  target  recognition, 
in  particular  where  sensor  measurements  are  not  linearly  predictable.  The  techniques 
described  in  Sect.  3.4  and  demonstrated  in  Chapter  IV  represent  a  completely  new  approach 
to  recognition  of  dynamic  objects  in  general  and  aircraft  targets  in  particular.  The  potential 
effectiveness  of  these  methods  can  be  gauged  by  comparing  the  likelihood  values  for  correct 
and  incorrect  associations  in  Figs.  4.9  and  4.10,  or  the  state  estimates  for  correct  and 
incorrect  associations  in  Figs.  4.5  and  4.6. 

(2)  Extension  of  the  Larson  and  Peschon  equations  [133]  (Sects.  2.4.4  and  3.6)  to  provide 
new  methods  for  dynamic  object  recognition  using  measurements  from  ambiguous  feature 
observable spaces, considering - in an optimal fashion - a priori information from kinematics
and other sources as to the likelihood of transitions on the underlying aspect angle state
space.  The  reduced  ambiguity  provided  by  this  approach  over  conventional  “independent 
look”  processes  is  demonstrated  in  Figs.  5.27  through  5.35  -  up  to  20%  improvement  in 
percentage of correct recognition is observed in this particular application.

(3) Extensions of the theory and practice of classical dynamic programming-based sequence
comparison  to  include  feature  observable  sequences  arising  from  an  aspect  angle  subspace. 
The  reader  may  compare  Figs.  2.8  and  3.8  to  observe  this  extension  graphically.  Results 
showing  reduced  ambiguity  for  these  algorithms  compared  to  conventional  approaches  are 
seen  also  in  Figs.  5.27  through  5.35. 

(4) Combination of the Larson and Peschon equations with conventional linear
estimators  to  provide  a  new  form  of  estimator,  suitable  in  particular  for  object  recognition 
with  ambiguous  feature  observables,  generated  from  dynamic  subspaces  that  exhibit  linear 
behavior  in  some  respects.  This  process  is  described  in  Sects.  3.9  and  6.2.1. 

(5) Through contributions (1) through (4) and application of Bayes’ Rule, several new
approaches for multisensor fusion to obtain a posteriori probabilities of object class membership,
conditioned jointly on kinematic and “nonkinematic” or feature observable information
and a priori information on each known object class. In particular, Eqn. (3.18),
a development of the Larson and Peschon equations, is a completely new expression for a
posteriori probability of target class, allowing the user to fuse ambiguous sensor signatures
and kinematic information in an optimal fashion for object recognition. Other relevant expressions
are Eqns. (3.2) and (3.21). These equations can be readily used with likelihood
values generated as shown in Chapters IV and V.

(6) Identification of a new method for evaluating object recognition algorithms - the generalized
ambiguity function. This approach provides a useful alternative to the conventional
performance approach of evaluating probabilities of correct and incorrect recognition.
Figs. 5.27 through 5.35 include several examples of performance evaluation by generalized
ambiguity functions.

(7)  Extension  of  classical  parameter  space  concepts  into  the  field  of  object  and  target 
recognition.  In  particular,  the  discussion  of  Sect.  3.11.3  and  the  target  model  generation 
procedures  in  Sect.  5.3,  combined  with  later  results  in  Chapter  V,  motivate  the  need  to 
consider  objects  as  discrete  sets  of  parameters  in  continuous/discrete  parameter  spaces, 
and demonstrate new methods for allowing that consideration.

(8) Through contributions (6) and (7), identification of a new and practical approach for
obtaining  a  Cramer-Rao  lower  bound  for  dynamic  object  and  target  recognition  algorithms. 
The  potential  in  this  approach  can  be  noted  by  comparing  the  discussion  in  Sects.  2.7 
and  3.11.3  with  the  generalized  ambiguity  function  plot  of  Fig.  5.31. 
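
The link between the generalized ambiguity function and the Cramer-Rao lower bound named in contribution (8) can be sketched as follows. The notation is assumed for illustration and is not copied from Eqn. (2.37): θ denotes the state/parameter vector, θ_t its true value, Z the measurement history, and the ambiguity function γ is taken here to be the expected log-likelihood. Under the usual regularity conditions, and for an unbiased estimate of θ:

```latex
\begin{align}
\gamma(\theta, \theta_t) &= E\left[\, \ln L(\theta; Z) \mid \theta_t \,\right], \\
J(\theta_t) &= -\left. \frac{\partial^2 \gamma(\theta, \theta_t)}
                {\partial \theta \, \partial \theta^T} \right|_{\theta = \theta_t}, \\
E\left[ (\hat{\theta} - \theta_t)(\hat{\theta} - \theta_t)^T \right]
  &\ge J(\theta_t)^{-1}.
\end{align}
```

Read this way, a sharply peaked ambiguity function at the true parameter point corresponds to a large Fisher information matrix J and therefore to a small lower bound on the estimate error covariance.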

In  sum,  this  research  provides  a  wide  range  of  new  approaches  for  fusing  kinematic 
and  “nonkinematic”  or  sensor  signature  information  in  dynamic  object  recognition,  and 
evaluating  the  results  of  that  fusion.  Multisensor  fusion  of  target  kinematic  and  sensor 
signature  information  remains  an  exceptionally  rich  and  promising  field.  The  research 
described  here  has  illuminated  significant  new  directions  for  research  in  multisensor  fusion 
and  target  recognition.  First,  multiple  model  estimators  and  syntactic  sequence  comparison 
methods  offer  powerful  tools  for  exploiting  the  joint  likelihood  of  physical  processes  over 
time.  Second,  treating  tactical  targets  as  collections  of  parameters  executing  particular 
state  transitions  over  time  allows  one  to  apply  a  very  large  family  of  existing  estimation 
and  control  approaches  to  dynamic  object  and  target  recognition.  The  methods  used  in 
this research are but members of this family - many other methods await application.

Appendix A.  Definitions

Algorithm  Q  An  algorithm  developed  by  General  Dynamics  [20]  for  target  recognition 
by  high  range  resolution  radar,  or  the  physical  realization  of  this  algorithm  as  a  set 
of  Fortran  subroutines.  See  slide  distance,  below  (note:  the  elements  of  General 
Dynamics  responsible  for  this  effort  were  absorbed  by  Hughes  in  1993). 

ambiguity function, generalized ambiguity function  In general radar usage, an ambiguity
function [57, 15, 16] is a real function of two variables, time delay and frequency,
which provides an indication of a radar’s ability to distinguish between two
targets with different ranges and/or Doppler shifts. Ambiguity functions are used in
radar pulse waveform design. The generalized ambiguity function (see Eqn. (2.37))
is the expected or mean value of a generalized likelihood function defined for various
conditions, taken over all possible measurement values for some set of true conditions,
and provides an indication of the ability of a likelihood function to distinguish
between two state/parameter sets.

angle of attack The angle, generally and in this work denoted α, between the body x_b
vector and the projection of an aircraft’s wind vector (or velocity vector, for an
atmosphere at rest with respect to the inertial frame as assumed in this work) onto
the body x_b - z_b plane. See Fig. 5.23.

a  posteriori  The  definition  of  this  term  of  interest  to  us,  from  the  several  provided  by 
Webster, is “based on observation or experience; empirical” (as opposed to “a priori”).
In our case, this term refers to the whole or sum of information available about
some  event  or  system  after  additional  or  new  information  is  extracted  from  some  set 
of  measurements  or  observations  and  combined  with  a  priori  knowledge. 

a  priori  The  definition  of  this  term  of  interest  here,  from  the  several  provided  by  Webster, 
is “before examination or analysis” (as opposed to “a posteriori”). In our case, this
term  refers  to  information  available  about  some  event  or  system  before  additional 
or  new  information  is  extracted  from  some  set  of  measurements  or  observations.  A 
priori  knowledge  may  be  based  on  previous  measurements  or  observations,  theoretical 
knowledge, other sources, or may not exist.

aspect angle path (true or estimated) The path traced on the hypothetical aspect angle
sphere (see reference below) of unit radius by the object-to-sensor unit vector.
May  be  true,  based  on  true  object  orientation  and  sensor  location,  or  estimated,  as 
from  kinematic  measurements,  state  estimation,  and  assumptions  as  to  orientation 
required  for  given  kinematics. 

aspect  angle  region,  subset  See  aspect  angle  space  first.  A  region  on  or  subset  of  the 
hypothetical  aspect  angle  sphere  (of  unit  radius).  If  an  aspect  angle  region  or  subset 
is one-dimensional, it is equivalent in description to an aspect angle path.

aspect angle space The entire 4π steradian (two-dimensional) extent of the hypothetical
aspect  angle  sphere  (equivalently,  the  surface  of  the  unit  sphere)  is  closed  under 
allowable  transitions  on  that  sphere,  and  is  therefore  considered  an  aspect  angle 
“space.”  A  third  dimension  of  aspect  angle  (“roll”  about  the  sensor-object  vector) 
may  be  required  to  specify  angular  relationships  for  signatures  that  are  not  in-plane 
rotation invariant, or where the full three degree-of-freedom rotational state of the
object  is  under  consideration.  See  Sect.  3.2. 

aspect  angle  sphere  See  “hypothetical  aspect  angle  sphere.” 

bank angle An Euler angle about the body frame x_b axis, last in a series of three Euler
angles  (azimuth,  elevation,  and  bank)  required  to  rotate  the  navigation  (inertial) 
frame  into  the  body  frame.  This  is  not  the  same  as  roll  angle  (see  roll  angle  below 
and  [79]). 

bijective  A  term  used  to  describe  a  mathematical  function  that  is  both  “one-to-one”  and 
“onto”  (these  terms  defined  below). 

bin Generally, an interval in some space, usually one interval of a set of equal-width
intervals defined by a partitioning of a simply connected subset (a larger interval) of a
one-dimensional space. Values of some function over the space at some point or points
which fall within the interval of a given bin may be assigned, summed, or integrated
into that bin, allowing the distribution of the (possibly continuous) function values to
be approximated by discrete bin numbers and values. Representation of observed or
measured function values as bin “return” values is often forced by limited resolution
of the measurement system (i.e., sensor resolution equivalent to bin width) or a desire
to use clustering techniques on the output.
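
As an illustration of the binning just described, the following minimal Python sketch sums sampled function values into equal-width bins. The function name `bin_returns` and its arguments are hypothetical, chosen for this example only, and the half-open interval convention [lo, hi) is an assumption.

```python
def bin_returns(samples, lo, hi, n_bins):
    """Sum (position, value) samples into n_bins equal-width bins spanning [lo, hi)."""
    width = (hi - lo) / n_bins
    bins = [0.0] * n_bins
    for position, value in samples:
        if lo <= position < hi:
            # Guard against floating-point round-up at the upper edge.
            index = min(int((position - lo) / width), n_bins - 1)
            bins[index] += value
    return bins
```

Samples falling outside the partitioned interval are simply discarded here; a real sensor model might instead fold them into the edge bins.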

cross-correlation  In  stochastic  processes,  a  measure  of  the  interdependence  between  two 
(possibly  vector- valued)  random  variables,  found  by  taking  the  expected  value  of  the 
outer product of the two variables over all realizations [153:93]. The one-dimensional
signed  (cross-)  correlations  c(n)  for  two  signals  or  images  a  and  b  represented  as 
discrete  sequences  (see  [93,  99,  178])  or  vectors  of  K  or  more  elements  are  defined  by 
equations  of  the  basic  form: 


$$c(n) = \sum_{k=1}^{K} a(k)\, b(k - n)$$

In general it is desired to find the value of n which maximizes this expression. This
expression  can  be  extended  to  provide  for  two  or  more  dimensions  in  a  and  b,  and 
to  provide  for  normalization  by  the  “energy”  (or  an  analogous  measure)  in  a  and  b. 
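
The search for the maximizing shift can be sketched in a few lines of Python. This is only an illustration under stated assumptions: sequences are zero-indexed lists (the definition above indexes from 1), b is treated as zero outside its extent, no energy normalization is applied, and the names `cross_correlation` and `best_shift` are hypothetical.

```python
def cross_correlation(a, b, n):
    """c(n) = sum over k of a(k) * b(k - n), with b taken as zero outside its range."""
    total = 0.0
    for k in range(len(a)):
        j = k - n
        if 0 <= j < len(b):
            total += a[k] * b[j]
    return total


def best_shift(a, b, max_shift):
    """Return the shift n in [-max_shift, max_shift] that maximizes c(n)."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda n: cross_correlation(a, b, n))
```

For example, if a is a copy of b delayed by two samples, `best_shift(a, b, 3)` recovers the delay of 2.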

decision theoretic (pattern recognition) Pattern recognition systems which call for
each pattern class to be represented by a point or collection of points in a multidimensional
feature space, each dimension of which is parameterized by a continuous
or discrete set of values of some measurable quantity, a feature. Where more than one
point is assigned to a pattern class, each point generally represents one realization
of a stochastic (i.e., “random”) process which generates the feature values for that
class. The process is generally stochastic in four respects, due to unknown or a priori
unpredictable variations in (1) class parameters, (2) class state, (3) environment (in
particular, as affects the object-sensor line-of-sight), or (4) sensor system.

doppler velocity (measurement) A quantity derived from a returned waveform (generally
emitted originally by the receiving sensor - that is, a monostatic sensor arrangement)
which indicates the relative velocity between the sensor and the observed
object, based on (doppler) frequency shifts in the waveform upon reflection at the
observed object. For a monostatic sensor, doppler velocity is a measurement only
of the component of sensor-object relative velocity along the sensor-to-object vector,
and provides no information on object velocity normal to that vector. Doppler
velocity can be obtained from radars of the type considered in this research [16, 203].

dynamic  programming  A  technique  for  finding  a  best  or  “shortest”  path  through  a 
discrete,  finite  set  of  possible  decisions,  where  that  path  obeys  the  Principle  of  Op¬ 
timality  (see  Sect.  2.4.1). 

dynamic  time  warping  A  technique  for  assessing  the  similarity  of  two  finite  sequences 
of  speech  features  or  feature  vectors,  as  “discretized”  or  extracted  from  windowed  or 
partitioned  continuous  speech,  by  removing  differences  due  to  expansion  or  contrac¬ 
tion  of  one  sequence  relative  to  the  other  (see  Sect.  2.4.2). 
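
A minimal Python sketch of dynamic time warping follows, as an illustration only. It assumes non-empty sequences, an absolute-difference local distance, and the basic step pattern (advance in either sequence or in both); these are not necessarily the constraints used in Sect. 2.4.2, and the function name `dtw` is hypothetical.

```python
def dtw(x, y, dist=lambda u, v: abs(u - v)):
    """Return (cost, path) for the minimum-cost alignment of sequences x and y."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = minimum cumulative cost of aligning x[:i+1] with y[:j+1]
    D = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(x[i], y[j])
            if i == 0 and j == 0:
                D[i][j] = d
            else:
                best_prev = min(
                    D[i - 1][j] if i > 0 else inf,
                    D[i][j - 1] if j > 0 else inf,
                    D[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
                D[i][j] = d + best_prev
    # Backtrack from the final association to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((D[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((D[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((D[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return D[n - 1][m - 1], path
```

Warping `[1, 2, 3]` against `[1, 2, 2, 3]` gives zero cost, since the second sequence is the first with one element held for an extra sample; the returned path lists the index pairs that were associated.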

dynamics  The  branch  of  mechanics  that  deals  with  forces,  masses,  and  their  relation 
primarily  to  motion,  but  also  sometimes  to  equilibrium  conditions  (as  opposed  to 
statics,  which  deals  only  with  force  and  mass  in  absence  of  motion,  and  kinematics, 
which  is  defined  below;  see  [144]). 

extended  Kalman  filter  A  class  of  mathematical  estimators  derived  from  the  (linear) 
Kalman  filter,  designed  to  make  “near-optimal”  estimates  of  quantities  for  which 
time-propagation and/or measurement equations are nonlinear, using truncated Taylor
series and relinearizations about current values (see Sect. 2.3.1.1).

feature  This  term  has  basically  two  meanings  in  pattern  recognition,  depending  on  the 
form  of  pattern  recognition  considered.  In  decision  theoretic  or  syntactic  pattern 
recognition,  a  feature  is  some  quantity  or  property  pertaining  to  object  classes 
of interest which can be measured or observed, but which may or may not provide
unambiguous information as to class membership. In point or set-of-points
correspondence-based image recognition techniques, however, a feature is some portion
of the object image pertaining to object classes of interest which is believed to
be meaningful to the recognition process (a landmark), but which may or may not
be easily measurable and usable by the recognizer - the object of these correspondence
systems is to find correspondences between points on the unclassified object
and features on a library image (see App. B).


feature  attribute  Generally  equivalent  to  the  first  meaning  given  for  the  term  feature. 
See  feature  observables  below. 

feature  observables  Equivalent  to  the  first  meaning  given  for  the  term  feature,  this  term 
is  preferred  by  the  author  over  the  term  feature  attribute  because  of  the  emphasis 
given  to  the  fact  that  the  sensor  can  observe  or  measure  these  quantities  (presumably 
an object could have an “attribute” that is not observable).

feature  space  A  vector  space  in  which  a  given  feature  observation  or  measurement  of  the 
object  can  be  expressed  as  a  point.  Thus,  the  dimension  of  the  feature  space  is  deter¬ 
mined  by  the  number  of  individual  scalar  elements  (which  may  not  be  independent 
in  any  sense  of  the  word)  in  a  feature  observable  measurement  vector.  Feature  spaces 
of dimension n are generally taken to be isomorphic [170:173] to R^n, the (Cartesian)
vector  space  defined  by  a  collection  of  n  mutually  orthogonal  real  axes  sharing  a 
common  origin  (see  App.  B). 

fire control The process of controlling the operation of a classical gun or unguided rocket-type
weapon system, i.e., a weapon system firing projectiles which cannot be controlled
in flight, for which prediction of target trajectory over the projectile time of
flight  is  the  critical  concern  in  aiming. 

flight  path  angle  The  angle  between  an  aircraft’s  velocity  vector  (not  generally  equal  to 
the  wind  vector)  and  the  local  horizontal  plane  (see  Fig.  5.23). 

global  descriptor  A  (generally  scalar)  value  which  is  in  general  a  function  of  the  entire 
(global) extent of an object, as isolated (segmented) and observed by some sensor (see
App.  B). 

heading  angle  The  angle  between  local  north  and  the  (vertical)  projection  of  an  aircraft’s 
velocity  (not  generally  wind)  vector  onto  the  local  horizontal  plane  (see  Fig.  5.23). 

heuristic  Defined  by  Webster  as  “valuable  for  stimulating  or  conducting  empirical  re¬ 
search  but  unproved  or  incapable  of  proof.”  In  discussion  of  optimal  estimation 
techniques, heuristic carries the meaning “not mathematically rigorous.” In pattern
recognition [212:18-19], heuristic carries the added meaning “based on human
intuition and experience,” so that artificial intelligence-based pattern recognition
procedures  are  considered  to  fall  into  the  heuristic  category. 

high  range  resolution  (HRR)  A  term  used  to  refer  to  radar  systems  which  can  isolate 
reflected  returns  from  one  pulse  into  partitioned  intervals  of  return  time  such  that 
the  speed  of  light  multiplied  by  one  time  interval  equates  to  a  range  extent  at  the 
target  that  is  much  smaller  than  the  total  extent  or  projection  of  the  target  along 
the  sensor-to-target  vector.  Each  such  range  or  time  interval  equivalent  is  a  range 
“bin.” 

hypothesis  From  Webster,  a  proposition  or  supposition  tentatively  accepted  to  provide  a 
basis  for  further  investigation  (e.g.  in  our  case,  a  choice  of  a  priori  likely  parameter 
sets  for  testing  against  observed  measurements  from  an  unknown  object  or  parameter 
set). 

hypothetical  aspect  angle  sphere  A  hypothetical  sphere  of  unit  radius,  centered  on 
the  origin  of  the  object  body  frame  and  fixed  with  respect  to  the  body  frame  axes. 
Provides  a  physical  representation  for  the  concept  of  a  two-dimensional  aspect  angle 
path,  region  or  space  (these  terms  defined  above).  See  Fig.  1.2. 

Kalman  filter  A  class  of  mathematical  estimators  designed  to  make  optimal  estimates  of 
quantities  (states)  for  which  time  rate  of  change  (propagation)  can  be  described  as 
linear  equations  driven  by  white  Gaussian  noise  of  known  statistics  and  deterministic 
inputs,  and  for  which  measurements  are  available  which  can  be  described  as  a  linear 
combination  of  some  or  all  of  the  states  and  inputs,  plus  additive  white  Gaussian 
noise  of  known  statistics.  The  term  “optimal”  is  with  respect  to  a  number  of  pos¬ 
sible  criteria  for  judging  estimator  optimality:  the  estimate  is  the  mean,  mode,  and 
median  of  the  conditional  density  of  the  estimated  variable,  conditioned  on  available 
measurements;  the  estimate  is  the  minimum  mean-square  error  estimate;  and  so  on 
(see  [153:231-236]). 
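
For illustration, one propagate/update cycle of a scalar, discrete-time Kalman filter can be sketched in Python as below. This is a minimal sketch rather than the estimator used in this research: the discrete-time form, the scalar state, the function name `kalman_step`, and the default parameter values are all assumptions made for the example.

```python
def kalman_step(x, P, z, F=1.0, Q=0.01, H=1.0, R=1.0):
    """One propagate/update cycle of a scalar discrete-time Kalman filter.

    x, P : prior state estimate and its variance
    z    : new measurement
    F, Q : state transition factor and process noise variance
    H, R : measurement factor and measurement noise variance
    """
    # Propagate the estimate and its variance to the measurement time.
    x_pred = F * x
    P_pred = F * P * F + Q
    # Residual: observed measurement minus predicted measurement.
    residual = z - H * x_pred
    S = H * P_pred * H + R      # residual variance
    K = P_pred * H / S          # Kalman gain
    # Update the estimate and variance with the new measurement.
    x_new = x_pred + K * residual
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new, residual
```

With equal prior and measurement variances and no process noise, the updated estimate falls midway between the prediction and the measurement, and the variance is halved.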

kinematic  Having  to  do  with  the  branch  of  mechanics  (kinematics)  that  deals  with  aspects 
of  motion  (position,  velocity,  acceleration,  and  so  on  in  translational  and  rotational 
degrees of freedom) apart from considerations of mass or force (see [144]). Generally
used in multisensor fusion to refer to measured quantities related directly to object
translation,  e.g.,  sensor-object  relative  position,  or  range  and  spatial  angle  quantities, 
and  their  derivatives. 

likelihood function, generalized likelihood function The classical likelihood function
p(z | ω_i) is the probability or probability density function, if the latter exists,
of obtaining a particular measurement z, given that we are observing an object
of class (i.e., state and parameter set) ω_i - with the maximum likelihood classifier
assigning an unknown object to the class for which this likelihood function is maximized.
A generalized likelihood function L is a function of z and ω_i defined for some
identification problem, ideally such that the maximum likelihood function value for
measurements from a true system with states and parameters is obtained from
the generalized likelihood function optimized for that set of states and parameters,
so that a correct state and parameter set identification can be made. The likelihood
function value need not provide a probability measure or analogous quantity (e.g., a
Dempster-Shafer mass). See Sect. 2.6.
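
The classical maximum likelihood classifier can be sketched in Python as follows. The example assumes scalar measurements and Gaussian class models, which is an illustrative choice and not a statement about the likelihood functions used in this research; the names `gaussian_likelihood` and `max_likelihood_class` are hypothetical.

```python
import math


def gaussian_likelihood(z, mean, var):
    """p(z | class) for a scalar Gaussian class model with the given mean and variance."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def max_likelihood_class(z, models):
    """Return the class label whose likelihood p(z | class) is largest.

    models maps each class label to a (mean, variance) pair.
    """
    return max(models, key=lambda c: gaussian_likelihood(z, *models[c]))
```

Given `models = {"A": (0.0, 1.0), "B": (5.0, 1.0)}`, a measurement of 1.0 is assigned to class A and a measurement of 4.0 to class B.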

morph  A  short  form  of  the  expression  morphological  transform  -  generally,  a  change  in 
the  shape  of  a  two-  or  three-dimensional  physical  object  (see  [22]). 

motion warping As defined for this research, the process of using dynamic programming-based
sequence comparison techniques to compare (1) a sequence of observed feature
vectors from an object of unknown class with (2) object library models, using
measurements of object kinematics and dynamic restrictions for each class known
(approximately) a priori, to determine the most likely object class to have generated
the observed sequence of feature vectors and kinematic measurements.

multisensor  fusion  From  [218:1]  (given  there  as  a  definition  for  data  fusion,  but  equally 
applicable  for  multisensor  fusion),  “a  multilevel,  multifaceted  process  dealing  with 
the detection, association, correlation, estimation, and combination of data and information
from multiple sources to achieve refined state and identity estimation, and
complete and timely assessments of situation and threat.” As employed in this research,
the term multisensor may refer to different elements of information obtained
from the same piece of physical hardware: e.g., object position measurements and
HRR  signatures  obtained  from  the  same  radar  assembly. 

nonkinematic  Used  in  multisensor  fusion  to  refer  to  measured  object  quantities  not  di¬ 
rectly  indicative  of  (and  not  having  the  dimensions  of)  translation  or  rotation  states 
and  higher  derivatives  thereof,  e.g.,  feature  observables,  sensor  signatures,  etc. 

nonparametric  Decision  theoretic  pattern  recognition  or  classification  techniques  which 
do  not  assign  a  probability  measure  of  class  membership  to  points  based  on  their 
position  in  feature  space,  making  instead  a  class  membership  decision  for  a  given 
measurement  based  on  the  distance  (in  some  defined  metric)  in  feature  space  from 
the  measurement  to  points  corresponding  to  known  classes.  Also  used  to  refer  to 
techniques  for  estimating  the  probability  density  function  of  a  process  without  a 
priori  knowledge  as  to  the  form  (Gaussian,  Chi-square,  etc.)  of  the  density. 

object-to-sensor  vector  or  unit  vector  The  vector  (or  corresponding  unit  vector)  as¬ 
sumed  to  originate  at  the  origin  of  the  object  body  frame  and  terminate  at  the  sensor 
aperture. 

occlusion,  occluded  An  occlusion  is  an  obstruction  in  the  line  of  sight  between  an  object 
and a location which prevents all or part of the (occluded) object from being observed
at  the  given  location. 

off-nominal A term used in this research to refer to errors between (1) the true object-sensor
aspect angle at any point in time and (2) the corresponding object-sensor
aspect  angle  as  estimated  from  measured  object  kinematics  and  an  assumption  of 
object  class,  such  that  the  error  so  described  does  not  lie  on  the  aspect  angle  path 
for  both  true  and  estimated  aspect  angle  sequences. 

one-to-one From [6:36], a term used to describe a function F for which each element
in the range F(S) of the function corresponds to only one element in the domain
S.

on-nominal A term used in this research to refer to errors between (1) the true object-sensor
aspect angle at any point in time and (2) the corresponding object-sensor
aspect angle as estimated from measured object kinematics and an assumption of
target class, such that the described error lies along the aspect angle path for both
true  and  estimated  aspect  angle  sequences. 

onto From [6:35], a word describing the relationship between a function F, its domain S,
and  a  set  T  containing  the  range  F(S)  of  the  function,  in  which  F(S)  =  T. 

parameter  A  quantity  of  interest  to  an  estimator,  fundamental  to  defining  the  behavior 
of  a  system  over  time  as  for  “state”  (defined  below),  but  which,  unlike  states,  is 
assumed  not  to  change  significantly  over  time  periods  of  interest. 

parametric  A  term  used  to  describe  decision  theoretic  classification  or  pattern  recognition 
techniques  which  explicitly  assess  some  measure  of  the  probability  of  observed  events 
(i.e.,  following  the  second  meaning  given  below  for  statistical  pattern  recognition). 
Also  used  to  refer  to  techniques  for  estimating  the  probability  density  function  of 
a process, where the form (Gaussian, Chi-square, etc.) of the density is known a
priori.

pose  estimate  An  estimate  of  the  aspect  angle  presented  by  some  object,  as  viewed  by 
a  sensor. 

PSRI  An  abbreviation  for  “Position-,  Scale-,  and  (In-Plane)  Rotation-Invariant.”  This 
phrase  means  that  the  value  so  termed,  corresponding  to  an  observation  of  some 
object  by  some  sensor,  will  be  the  same,  (1)  independent  of  position  changes  normal 
to  the  sensor-object  vector  (so  long  as  the  object  remains  in  the  field  of  view),  (2) 
independent  of  scale  changes  as  from  changes  in  sensor-object  range  or  magnification, 
and  (3)  independent  of  rotations  of  the  sensor  relative  to  the  object  about  the  sensor- 
object  vector  only.  This  term  does  not  imply  invariance  of  the  described  value  with 
respect  to  object  aspect  angle  changes.  Also,  quantities  that  are  theoretically  PSRI 
may  not  be  so  in  practice,  due  to  factors  like  pixel  orientation  and  size. 

radar  cross  section  (RCS)  From  [16:A-14]:  a  measure  of  the  reflective  strength  of  a 
radar  target;  usually  represented  in  square  meters  (or  decibel  square  meters  -  dBsm), 
and defined as 4π times the ratio of (1) the power per unit solid angle scattered in
a specific direction to (2) the power per unit area in a plane wave incident on the
scatterer from a specified direction.

range sweep The return or output realized by reflecting one pulse from a high range
resolution  radar  against  a  target,  expressed  as  radar  cross  section  in  each  of  many 
“range  bins”  in  some  multiple-bin  interval  along  the  sensor-target  vector  (may  also 
refer  to  the  summed  results  from  many  such  pulses). 

residual The difference between measurements observed and predicted (generally, by an
estimator) for some measurement event (see Sect. 2.3.1.1).

roll angle An Euler angle about the velocity frame x_v axis, taken prior to sideslip angle
and  angle  of  attack,  first  in  the  sequence  of  Euler  angles  required  to  rotate  the  velocity 
frame  into  the  body  frame  (see  Fig.  5.24).  This  is  not  the  same  as  bank  angle  (see 
above). 

segmentation  A  general  term  for  the  process  by  which  a  sensor  (usually  an  imaging  sen¬ 
sor)  system  separates  potential  objects  from  probable  background  clutter,  generally 
prior  to  processing  the  separated  or  segmented  objects  for  recognition  or  discrimi¬ 
nation,  although  segmentation  may  be  performed  in  combination  or  iteratively  with 
other  processes. 

sideslip angle The angle, generally and in this work denoted β, between an aircraft’s
wind vector (or velocity frame unit vector x_v, for an atmosphere at rest with respect
to the inertial frame as assumed in this work) and the (perpendicular) projection of
that vector on the body x_b - z_b plane. See Fig. 5.23.

slide  distance  A  distance  defined  by  Algorithm  Q  (see  above)  which  attempts  to  quantify 
the  difference  between  two  high  resolution  radar  range  sweeps,  in  accordance  with  a 
metric  defined  by  General  Dynamics  (see  Sect.  2.2.3). 

smoother  An  estimator  which  makes  estimates  of  some  quantity  (state,  etc.)  at  some 
time,  based  or  conditioned  not  only  on  measurements  prior  to  that  time,  but  also  on 
measurements taken after that time (see Sect. 2.3.1.2).

state From [153:26]: the state of a system at any time t is a minimum set of values
x_1(t), ..., x_n(t) (an n-dimensional vector), which, along with the input to the system
for all time τ, τ > t, is sufficient to determine the behavior of the system for all
τ > t.

statistical pattern recognition A term used to refer either to decision theoretic (see
above) pattern recognition concepts in general, or (more precisely) to decision theoretic
methods which define a probability or classical likelihood of class membership
based  on  the  location  of  a  feature  observable  measurement  in  the  feature  space  (the 
latter  also  called  “parametric”  classification  methods). 

structural  pattern  recognition  A  family  of  pattern  recognition  concepts  which  assign 
class  membership  based  on  the  type,  number,  and,  in  some  sense,  relationships  or 
structure  between  observed  features  for  some  object  of  unknown  class,  and  features 
for  known  classes. 

sup  An  abbreviation  for  supremum,  or  least  upper  bound  of  a  set  [6:9].  If  the  set  has  a 
maximum  element,  that  element  is  also  the  supremum. 

syntactic  pattern  recognition  An  alternative  name  for  structural  pattern  recognition 
(see  above),  reflecting  the  language-based  origin  of  many  techniques  in  this  area  [90]. 

target  acquisition  and  tracking  The  process  of  identifying  a  potential  target  using  a 
sensor  system  and  following  the  motion  of  that  target  over  time. 

warping  path  A  particular  sequence  of  associations  between  elements  of  two  feature  ob¬ 
servable  sequences.  Due  to  continuity  constraints,  or  rules  for  the  associations,  the 
allowable  sequences  of  associations  have  the  appearance  of  paths  through  the  warping 
path  region  or  “space”  (see  Fig.  2.8). 

warping  path  cost  The  total  cost  or  distance  associated  with  a  particular  sequence  of 
associations  between  elements  of  two  feature  observable  vector  sequences.  The  path 
having  minimum  warping  path  cost,  subject  to  various  rules  for  the  associations,  is 
taken  to  identify  the  expansions,  compressions,  insertions,  and  deletions  which  make 
the  two  sequences  most  similar  to  each  other  (see  Eqn.  (2.29)). 

warping path region or “space” The finite set of all possible associations between elements
of two finite sequences of feature observable vectors, from which sets of associations
may be defined under applicable rules to define warping paths. These
spaces are illustrated for (1) warping of two one-dimensional sequences in Fig. 2.8
(producing a “two-dimensional” warping path region or space) and for (2) warping
of a one-dimensional observed sequence against a two-dimensional region of possible
sequences in Fig. 2.9 (producing a “three-dimensional” warping path region or
space).

Appendix B. Background Information on Pattern Recognition

This  appendix  is  intended  to  provide  a  tutorial  overview  of  pattern  recognition  for 
those  without  a  background  in  the  field.  It  should  be  read  following  the  brief  overview  in 
Sect.  2.2. 

B.1 Taxonomy of Pattern Recognition Concepts for 3-D Objects.

The  problem  with  taxonomies  for  pattern  recognition  is  that  every  author  in  the 
field  has  his  or  her  own.  More  specifically,  pattern  recognition  is  a  multidimensional  topic 
in  the  mathematical  sense  of  the  word,  and  each  author  makes  distinctions  along  his  or 
her  preferred  directions  in  the  space  of  all  possible  pattern  recognition  concepts.  The 
following  “multidimensional”  taxonomy  is  fundamentally  based  on  the  works  of  the  late 
renowned  K.S.  Fu  [90,  89,  88],  Miclet  [161]  and  Tou  and  Gonzalez  [212],  and  draws  from 
the  discussion  on  concepts  for  “recognition,  tracking,  and  pose  estimation  of  arbitrarily 
shaped  3-d  objects...”  given  by  Gottschalk  et  al.  [102],  further  supplemented  by  material 
from  Duda  and  Hart  [72],  Fukunaga  [91,  92],  Pratt  [178],  Nasr  [169:111-139]  and  others  [55, 
76,  176]. 

Fu [90] sets the fundamental distinction in pattern recognition as between “decision
theoretic” and “syntactic” (or “structural”) methods. Decision theoretic classification
methods include the classical nonparametric (feature space distance-based discriminant
or decision function) and parametric (probabilistic, ideally Bayesian) classifiers. Miclet
[161] similarly sets the fundamental distinction as one between “statistical” and
“syntactic/structural” methods, including, as does Fukunaga [91, 92], both parametric and
nonparametric classifiers under the heading of “statistical.” By contrast, Tou and Gonzalez
[212] would agree generally with Fu’s taxonomy, but use the term “mathematical”
rather than “decision theoretic,” and add a third category, “heuristic” (ad hoc procedures,
often based on human experience, including artificial intelligence methods). In contrast
to Miclet and Fukunaga, Tou and Gonzalez would consider only parametric classifiers to
be “statistical” in nature. Note that we specifically distinguish the task of classification
from (probability) density estimation - density estimation also can be parametric or
nonparametric, respectively, depending on whether or not the form of the density (Gaussian,
Poisson, etc.) is known a priori.

In  general,  it  can  be  said  that  decision  theoretic/mathematical,  or  (in  the  broad 
sense  of  Miclet  and  Fukunaga)  statistical  classification  concepts  are  those  which  require 
measurement  or  calculation  of  some  quantity  or  quantities  (features)  for  all  known  object 
classes  to  establish  a  library  or  map  in  the  feature  space  (equivalently,  in  the  terminology 
established  in  Chapter  I,  the  feature  observable  space).  The  recognition  process  then  seeks 
to  establish  class  membership  for  an  object  of  unknown  class,  based  on  the  nearness  (in 
some  sense)  of  the  object’s  measured  features  to  elements  in  the  library  or  points  on  the 
map.  Moreover,  the  parametric,  or  “true”  statistical  classifier,  provides  a  relative  measure 
or  estimate  of  the  probability  of  the  object’s  membership  in  any  given  class. 
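
The contrast drawn above between distance-based and probabilistic classifiers can be sketched in Python. This is a hedged illustration only: it assumes a scalar feature, one prototype per class for the distance-based rule, and Gaussian class densities with known priors for the probabilistic rule. The names `nearest_class` and `posteriors` are hypothetical.

```python
import math


def nearest_class(z, prototypes):
    """Distance-based decision: label of the prototype closest to measurement z."""
    return min(prototypes, key=lambda c: abs(z - prototypes[c]))


def posteriors(z, models, priors):
    """Probability of each class given z, by Bayes' rule with Gaussian class densities.

    models maps class label -> (mean, variance); priors maps class label -> prior.
    """
    joint = {}
    for c, (mean, var) in models.items():
        lik = math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        joint[c] = lik * priors[c]
    total = sum(joint.values())
    return {c: joint[c] / total for c in joint}
```

The first function returns only a decision, while the second returns a probability for every class, which is what allows confidence in the decision to be assessed.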

In structural or syntactic pattern recognition, on the other hand, we seek to define
a “grammar”, whereby each object class corresponds to a particular order of “pattern
primitives” - subpatterns which are readily recognized by the classifier. Pattern primitives
in a syntactic classifier could be identical to the features used in some decision theoretic
classifier - the key difference is the syntactic classifier’s concern with order of presentation.
A syntactic description of an object is generally some form of string or sequence,
and recognition of an unknown object is reduced to comparing its observed pattern primitive
sequence with sequences for known classes. Syntactic methods commonly make use
of analysis techniques and terms applied to the study of human and animal languages.
Significantly for our later purposes, Miclet [161] noted the association between syntactic
pattern recognition and dynamic programming sequence comparison techniques for speech
processing (Fu was undoubtedly aware of this relationship, but did not discuss it in detail
in works known to this author, cited above). This association is further discussed in
Sect. 2.4.5.
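As a minimal illustration of comparing an observed primitive sequence with sequences for known classes, the sketch below uses the Levenshtein (edit) distance, a standard dynamic programming sequence comparison. The primitive alphabet, the class names, and the function names are hypothetical, and the unit edit costs are an assumption rather than the weighting developed in this research.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b via dynamic programming."""
    prev = list(range(len(b) + 1))           # cost of building b[:j] from an empty prefix
    for i, x in enumerate(a, start=1):
        cur = [i]                            # cost of deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # substitute (free if x == y)
        prev = cur
    return prev[-1]


def classify_sequence(observed, class_models):
    """Return the class whose reference primitive sequence is closest to `observed`."""
    return min(class_models, key=lambda c: edit_distance(observed, class_models[c]))


# Hypothetical primitive strings: each letter stands for one pattern primitive.
class_models = {"class_A": "abcabd", "class_B": "ddcbaa"}
observed = "abcbd"   # class_A's sequence with one primitive missed by the sensor

best_class = classify_sequence(observed, class_models)
```

Because the comparison depends on the order of the primitives and not only on which primitives are present, two classes built from the same primitives remain distinguishable.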

The approaches of Fu, Miclet, and Tou and Gonzalez can be compared with that
discussed by Nasr [169:111], who saw, from his multisensor fusion/object recognition
perspective, a fundamental distinction between “statistical” and “model-based” approaches.
There is an important difference in approach between Nasr and his community on the one
hand and the former authors on the other. Nasr is concerned with information storage
techniques - his pattern recognition processing techniques are all basically decision
theoretic or heuristic (artificial intelligence) in nature. To Nasr, a “statistical” pattern
recognition system is one which maintains a library database, or mapping, or set of decision
surface parameters for feature information recorded a priori for objects, aspect angles, and
conditions of interest. This is certainly the original and most common way to store data for
decision theoretic classifiers, but the association between data storage method and
classification method is not exclusive from either direction.

Similarly, the potential exists for confusion between the syntactic or structural
approach of Fu et al. and Nasr’s models in the sense that structural approaches often specify
some “model” as the source of the distinctive sequence. But to Fu et al., a model would
most often be an abstraction - an automaton [88, 90, 212, 70] or one of its analogs (e.g.,
a hidden Markov model [176, 68]), generating a sequence of pattern primitives with a
distinctive structure, while Nasr’s model is a software simulation of a real three-dimensional
object like a tank or an aircraft, and the physics that produce observables of interest.
Moreover, the output of Nasr’s model would not generally be a characteristic sequence,
and might well be a predicted measurement vector or function no different in form from
those maintained in the database of his “statistical” classifier.

Gottschalk  [102]  on  the  other  hand,  working  from  the  classical  image  recognition 
perspective,  made  the  distinction  between  pattern  recognition  techniques  using  (1)  global 
descriptors  or  (2)  point  correspondences  as  we  will  discuss  below.  Briefly,  global  descriptors 
are  mathematical  quantities  that  are  defined  in  general  by  the  entirety  of  an  observed  object 
at  any  particular  aspect  angle.  If  any  element  of  the  object  (as  observed  or  measured  by 
the  sensor)  changes,  then  the  global  descriptor  value  may  change.  Alternatively,  point 
(or  locus  of  points)  correspondence  methods  require  the  classifier  to  identify  the  presence 
or  absence  of  some  particular  point,  line  segment,  entity  or  combination  thereof  on  the 
object,  such  as  one  would  perceive  when  viewing  the  object,  in  comparison  with  a  priori 
data  on  likely  object  classes.  Gottschalk’s  view  of  data  storage  by  library  database  or 
model  representation  is  similar  to  Nasr’s. 

Relating  Gottschalk’s  work  to  Fu  et  al.  and  Nasr,  we  note  that  global  descriptors 
are commonly used as features in decision theoretic/mathematical classifiers, while point
correspondence systems could be classed as heuristic, syntactic, or decision theoretic,
depending  on  the  operations  performed  using  the  point-based  information.  To  Gottschalk’s 
two  distinctive  techniques  for  image  recognition  we  will  add  and  distinguish  those  that  use 
correlations  (which  could  be  considered  a  kind  of  pointwise  correspondence).  Correlations 
should  probably  be  considered  as  a  very  special  form  of  decision  theoretic  classifier,  since 
they  can  rather  directly  provide  a  numerical  “distance”  value  indicating  similarity  or, 
perhaps  with  proper  design  and  training,  probability  of  class  membership.  Inherently, 
however,  the  correlation  process  considers  structured  information. 

A particular caution is in order at this point regarding the use of the term “feature.”
In discussions on global descriptors, the term “feature” is generally synonymous with the
descriptor itself - a numerical quantity derived from a sensor, a particular value of which
defines a particular location in some “feature space.” In discussions on correspondence
methods, however, a feature is that particular point, line segment, entity, or combination
thereof on the object - sometimes called a “landmark” [102] - which we desire to place in
correspondence with the like entity on the correct object library representation.

B.2  Decision  Theoretic  Object  Recognition. 

Combining  the  decision  theoretic  concepts  of  Fu  et  al.  [90]  with  the  terminology  for 
data  storage  techniques  used  by  Nasr  [169],  we  now  contrast  the  classical  nonparametric 
and  parametric  (Bayesian  and  maximum  likelihood)  approaches  to  object  recognition  as 
widely  used  in  the  multisensor  fusion  community. 

B.2.1 Decision Theoretic Methods - Survey. Recall that parametric or probabilistic
decision theoretic classifiers and the clustering of features in feature spaces for a
priori known objects and aspect angles were discussed in Sect. 2.2.1. By comparison, in
a typical nonparametric classifier, we use the same a priori information by measuring the
features of an unknown object, and then find the cluster closest in some distance sense
to the unknown object’s feature set or vector. The factors which defined this cluster are
then taken to indicate the unknown object’s class and orientation (often a discretized value
reflecting an orientation within some partitioned aspect angle interval or “bin”). Clearly,
the choice of metric and definition of “close” in our multidimensional feature space are
critical, as are the centroid-to-centroid separation and dispersion of the clusters.

One definition of distance or closeness calls for each object class/aspect combination
to be represented by a cluster centroid (mean) or single prototype point, so that the shortest
measurement-to-prototype distance “wins” (i.e., the unknown object is taken to belong to
the class of the nearest prototype) [212:77]. Another definition of closeness is established
by defining decision surfaces or functionals in the feature space to separate the clusters -
these functionals operate on observed feature values (vectors) to produce scalars, which are
measures of distance between the observed values and the surfaces. Precise locations for
decision surfaces may be defined automatically through a “supervised training” process -
providing the classifier with a number of observations labelled as to which object class
produced each, and allowing the decision surface parameters to converge to appropriate values.

If  the  clusters  are  of  different  dispersion  or  “diameter,”  but  completely  separated 
(in  the  sense  that  their  convex  hulls  do  not  intersect),  we  will  probably  prefer  to  define  a 
system  of  separating  hyperplanes  (linear  functionals  [170])  or  other  decision  surfaces  such 
that  the  position  of  a  measured  feature  vector  relative  to  the  surfaces  will  associate  it  with 
a  particular  cluster  [212:40-48].  If  the  clusters  overlap  but  their  component  points  are 
equally  likely  to  occur  overall  (or  if  the  relative  number  of  points  in  the  clusters  reflect 
the  a  priori  likelihood  of  the  conditions  associated  with  each  cluster),  we  may  prefer  to 
use  a  “K-nearest  neighbor”  technique  -  assigning  the  unknown  object  to  the  pattern  class 
which has the largest number of a priori-defined data elements in the total set of “K” data
elements  that  are  closest  to  the  unknown  object  feature  vector  [212:81-83].  As  an  example 
of  how  these  various  approaches  relate,  observe  that  if  each  cluster  has  only  one  point,  then 
the  (single)  nearest  neighbor  technique  is  equivalent  to  the  “nearest  prototype”  method. 
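The nearest prototype and K-nearest neighbor rules described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-feature library, the class labels, and the function names are hypothetical, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest_prototype(prototypes, z):
    """prototypes: dict label -> prototype vector. Return the label of the closest one."""
    return min(prototypes, key=lambda label: euclidean(prototypes[label], z))


def k_nearest_neighbor(library, z, k):
    """library: list of (feature_vector, label). Majority label among the k closest entries."""
    ranked = sorted(library, key=lambda item: euclidean(item[0], z))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical a priori library of labelled two-dimensional feature vectors.
library = [
    ((1.0, 1.0), "tank"), ((1.2, 0.8), "tank"), ((0.9, 1.3), "tank"),
    ((4.0, 4.2), "truck"), ((3.8, 3.9), "truck"), ((4.3, 4.1), "truck"),
]
prototypes = {
    label: centroid([f for f, l in library if l == label])
    for label in ("tank", "truck")
}
```

With one stored point per class, `k_nearest_neighbor(..., k=1)` and `nearest_prototype` return the same label, which is the equivalence noted in the text.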

Relating parametric to nonparametric approaches, we may observe that for two clusters
with Gaussian distributions, the decision surface, or locus of points in feature space
that defines the boundary between points with higher probability of belonging to one cluster
versus belonging to the other, is described by a quadratic (more properly, for higher
dimensions, hyper-quadratic) equation - defining the quadratic classifier. If the distributions
(covariances) of the two clusters are identical, the surface reduces to a hyperplane; when,
in addition, the common covariance is a multiple of the identity and the two clusters are
equally likely a priori, this hyperplane is normal to and bisects the line segment joining
the two cluster centers. The decision logic is then identical to that of a linear classifier
(i.e., a nonparametric decision made with a separating hyperplane).
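The reduction from a quadratic to a linear decision surface can be made explicit with a short derivation. The sketch below assumes Gaussian class-conditional densities with means $\mu_i$, covariances $\Sigma_i$, and prior probabilities $P(\omega_i)$; these symbols are introduced here for illustration only.

```latex
\begin{align*}
g_i(z) &= \ln p(z \mid \omega_i) + \ln P(\omega_i) \\
       &= -\tfrac{1}{2}(z-\mu_i)^T \Sigma_i^{-1}(z-\mu_i)
          - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i) + \text{const}, \\[4pt]
g_1(z) - g_2(z) &= 0
   \quad \text{(decision surface; quadratic in } z \text{ in general)}, \\[4pt]
\Sigma_1 = \Sigma_2 = \Sigma: \quad
  & (\mu_1-\mu_2)^T \Sigma^{-1} z
    - \tfrac{1}{2}\left(\mu_1^T \Sigma^{-1} \mu_1 - \mu_2^T \Sigma^{-1} \mu_2\right)
    + \ln\frac{P(\omega_1)}{P(\omega_2)} = 0, \\[4pt]
\Sigma = \sigma^2 I,\; P(\omega_1) = P(\omega_2): \quad
  & (\mu_1-\mu_2)^T \left(z - \tfrac{1}{2}(\mu_1+\mu_2)\right) = 0 .
\end{align*}
```

The quadratic terms in $z$ cancel only when the covariances are equal, leaving a hyperplane with normal $\Sigma^{-1}(\mu_1-\mu_2)$. The last line is the perpendicular bisector of the segment joining the two means.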

All nonparametric classifier techniques result generally in some object-to-class assignment,
but they do not provide all of the information we would like to have - in particular,
we  need  a  metric  of  class  membership  likelihood  that  can  be  combined  with  results  from 
other  classifiers,  with  each  result  weighted  optimally  according  to  our  confidence  in  the 
individual  classifier.  If  we  have  several  separate  object  recognition  classifiers  providing 
answers  which  we  must  fuse  to  derive  a  final  answer,  this  kind  of  relative  information  is 
critical  -  otherwise,  we  may  be  reduced  to  a  “voting”  system,  in  which  each  classifier  has 
only  one  “vote”  (although  votes  may  be  weighted  heuristically  based  on  our  confidence  in 
the  answer  given  by  a  particular  classifier). 

The probability measures provided by parametric classifiers, however, do provide a
metric suitable for comparison with other classifiers. The improved utility of information
and decision optimality provided by a (true) statistical or parametric classifier over a
nonparametric classifier in multisensor fusion and object recognition applications has brought
about a general preference for parametric classifiers over nonparametric classifiers for
associating measurements with a priori object feature information, commonly stored in a
multidimensional feature space representation or library database. This habitual association,
coupled with the terminology differences between authors noted above, commonly
leads to the somewhat ambiguous application of the term “statistical” to the classification
technique, the data storage technique, or the combination of the two.
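To illustrate why a probability measure is more useful for fusion than a bare class assignment, the following sketch contrasts a one-classifier-one-vote scheme with a Bayesian combination of class likelihoods. It assumes that the classifiers' observations are conditionally independent given the object class; the class labels, likelihood values, and function names are hypothetical.

```python
from collections import Counter


def fuse_posteriors(priors, likelihoods_per_classifier):
    """Bayes' rule with conditionally independent classifiers.

    priors: dict class -> P(class)
    likelihoods_per_classifier: list of dicts class -> p(z_k | class)
    Returns dict class -> P(class | z_1, ..., z_K).
    """
    unnormalized = dict(priors)
    for likelihoods in likelihoods_per_classifier:
        for c in unnormalized:
            unnormalized[c] *= likelihoods[c]
    total = sum(unnormalized.values())
    return {c: value / total for c, value in unnormalized.items()}


def majority_vote(likelihoods_per_classifier):
    """Each classifier casts one unweighted vote for its own most likely class."""
    votes = Counter(max(l, key=l.get) for l in likelihoods_per_classifier)
    return votes.most_common(1)[0][0]


# Hypothetical case: two classifiers weakly favor A, one strongly favors B.
priors = {"A": 0.5, "B": 0.5}
reports = [
    {"A": 0.55, "B": 0.45},
    {"A": 0.55, "B": 0.45},
    {"A": 0.05, "B": 0.95},
]
voted = majority_vote(reports)
fused = fuse_posteriors(priors, reports)
fused_best = max(fused, key=fused.get)
```

In this example the vote selects class A while the fused posterior favors class B, because the vote discards the strength of each classifier's evidence.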

B.2.2 Statistical Library Approaches. The “statistical library” or database/feature
space mapping is the original and most common method for storing a priori pattern
class information and implementing decision theoretic classification concepts. Considering
this database as a mapping of recorded feature values from known classes into our feature
space, decision theoretic concepts like separating hyperplanes are readily envisioned. All
such libraries are built by measuring or calculating and recording values of our chosen
features for selected values of object class, orientation, and other conditions (operating
state, background, etc.). In recent object recognition research reviewed by the author, these
features generally have been global; e.g., see [42, 43, 44, 94, 101, 169, 8]. Measurements
may be extracted from actual objects or models, and preserved as such or reduced to
decision surface functions, but the object or model form is not preserved as such in the
database.

An  increasingly  popular  alternative  to  classical  decision  theoretic  techniques  with 
“statistical”  library  databases  is  provided  by  artificial  neural  net  (ANN)  methods  [185,  219, 
174] and other “trainable” classifiers. It has been recognized for some time that “multilayer
perceptron, feedforward” neural nets and a variety of related classifier concepts can
be  “trained”  in  a  “supervised”  fashion,  so  that  a  trained  net  provided  with  an  unlabelled 
observation  will  output  or  indicate  the  appropriate  object  class.  The  neural  net  is  both  a 
data  storage  technique  and  a  classifier. 

The  significant  advantage  of  neural  nets  and  similar  trainable  classifiers  over  classical 
clustering  and  statistical  classifiers  is  simply  that  the  designer  expends  no  effort  in  defining 
clusters,  decision  surfaces,  or  object  class  probability  densities  in  his  or  her  chosen  feature 
space:  given  an  adequate  classifier  structure  to  begin  with,  the  classifier’s  parameters 
(generally)  converge  automatically  to  or  “learn”  appropriate  values  during  the  training 
process.  In  particular,  unlike  classical  nonparametric  training  approaches,  neural  nets  are 
generally  not  restricted  to  a  particular  functional  form  (linear,  quadratic,  etc.)  of  decision 
surface  -  the  net  constructs  arbitrary  decision  surfaces  as  required  from  elemental  functions 
in  the  net  structure. 

Originally, the output from the typical multilayer perceptron feedforward net was
taken only as indicating the “most likely” or “nearest neighbor” object class (source) for
the observed data (a nonparametric classification). Recently, however, Ruck [190] has
shown that this net approximates a Bayesian statistical classifier under certain conditions,
so that the relative magnitude of output values from the nodes representing the various
object classes may be taken as proportional in some sense to values of $p(\omega_i \mid z)$ in Eqn. (2.1).

The problem, however, with all statistical library approaches is their inflexibility to
changing environmental, background, operational, and other conditions that affect object
signatures. By including all possible such conditions in defining our databases of feature
values for each object/aspect, or in estimating the classical likelihood $p(z \mid \omega_i)$ for each
$\omega_i$, or in training a neural net, we may increase ambiguity, and may decrease the probability
of making a correct decision in any one particular set of circumstances. Building separate
databases or neural nets for each likely set of circumstances has its own pitfalls, principally
data storage problems and the unavoidable fact that, when approximating a multidimensional
continuous function (i.e., the set of all possible object scenarios) with values at a
finite number of discrete points, we will effectively never have the correct combination
required for a given implementation.

B.2.3 Model-Based Approaches. Recognition of (1) the problems with statistical
library-based approaches and (2) the greater capability afforded by modern data processors
has led the object recognition community to a general agreement that model-based systems
offer a more promising approach. “Model-based,” in the sense implied by Nasr [169] and
others [218], implies that the recognizer carries some form of 3-D representation of each
potential object and, accounting for object aspect, operation, environment, background, and
other conditions, calculates on-line in near-real time what the sensor should see, providing
a basis for comparison with actual measurements.

The  level  of  model  fidelity  is  completely  open  to  the  designer.  Systems  discussed 
in current literature vary from use of “wire frame” or stick models to exceptionally
complicated models that consider, for example, heat flow between vehicle components (e.g.,
see  [169]),  or  multiple  reflection  of  radar  waves  in  engine  cavities  [21].  Certainly,  suitability 
of  very  highly  complicated  models  for  on-line  implementation  remains  an  open  issue. 

Fundamentally,  model-based  approaches  seek  to  find  an  object  model  and  (generally, 
in  a  3-D  model)  orientation  that  define  the  “nearest  neighbor”  to  the  given  observation. 
Thus they are most generally (but not exclusively) suitable to nonparametric classification.

B.3 Pattern Recognition Using Global Descriptors.

As noted above, global descriptors are numerical quantities that are defined by the
entirety of an object at any particular aspect angle. If any element of the object as
measured by the sensor changes, then the global descriptor value may change.

Generally, the term “global descriptor” refers to some value extracted from a 2-D image.
Resolution and identification of specific local entities on the object are not important
in most global descriptor implementations. The fundamental attraction and yet weakness
of many classical image-extracted global descriptors is that they are largely or completely
defined by the object silhouette. However, dependence on the “whole image” makes these
global descriptors sensitive to errors caused by occlusion and segmentation (see Appendix A
for definitions). Another attraction of global descriptors is that many are position-, scale-,
and (in-plane) rotation-invariant (PSRI). In terms of pattern recognition for 3-D objects from
arbitrary aspect angles, this means that the global descriptors are nominally (excepting
noise, environment, and background effects) a function only of object type and two object
body-to-sensor angles, say azimuth and elevation. Target scale (range) and rotation angle
(“roll”) about the sensor-object vector are irrelevant in theory (practical factors, like pixel
size and orientation, may cause problems in practice). Three particularly popular forms
of global descriptors are discussed in the following subsections.

B.3.1  Moments.  Analogously  with  the  concept  of  two-dimensional  mass  moments 
for  a  flat  physical  object,  we  can  define  image  moments  for  a  segmented  object.  The 
moment  calculation  may  treat  each  pixel  within  a  given  segmented  contour  as  having 
equal  “mass,”  or  it  may  weight  each  pixel  according  to  some  additional  information.  For 
example,  pixels  in  IR  imagery  may  be  weighted  according  to  their  image  intensity  -  thus 
a  “hot”  pixel  has  more  “mass”  than  a  “cold”  one. 

Starting with these fundamental “physical” moments, Hu [111] derived a set of PSRI
features, commonly called “Hu moments,” that have seen extremely wide application [225].
See either of [111, 178] for a complete listing.
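As a small worked example of the moment calculation, the sketch below computes intensity-weighted raw and central image moments and the first two of Hu's invariants, $\phi_1 = \eta_{20} + \eta_{02}$ and $\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$, where $\eta_{pq} = \mu_{pq} / \mu_{00}^{1+(p+q)/2}$ are the scale-normalized central moments. The test image and function names are hypothetical; the image is assumed to be already segmented, with zero-valued background pixels.

```python
def raw_moment(image, p, q):
    """m_pq = sum of x^p * y^q * pixel value (pixel value acts as the 'mass')."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image) for x, v in enumerate(row))


def central_moment(image, p, q):
    """mu_pq: moment taken about the intensity-weighted centroid."""
    m00 = raw_moment(image, 0, 0)
    xc = raw_moment(image, 1, 0) / m00
    yc = raw_moment(image, 0, 1) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * v
               for y, row in enumerate(image) for x, v in enumerate(row))


def hu_first_two(image):
    """Return (phi1, phi2), the first two Hu invariants of the image."""
    mu00 = central_moment(image, 0, 0)

    def eta(p, q):
        # Scale-normalized central moment.
        return central_moment(image, p, q) / mu00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


# Hypothetical segmented binary image of an L-shaped object.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
rotated = [list(row) for row in zip(*image[::-1])]      # 90 degree in-plane rotation
shifted = [[0] * 7] + [[0, 0] + row for row in image]   # translation within a larger frame
```

The values returned for `image`, `rotated`, and `shifted` agree to within rounding error, which is the position and rotation invariance described above.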

B.3.2 Fourier Descriptors. Fourier descriptors [98, 178, 217, 101] are a class of
frequency domain descriptions for the silhouette of a segmented object. For example, we
may take the Fourier series expansion of the curvature of the silhouette, curvature being
defined as the derivative of the tangent angle with respect to length along the silhouette
curve. Clearly, the curvature is periodic and real as we make multiple traverses of the
silhouette - thus we can find the Fourier series expansion for the curvature, and extract
a finite number of Fourier coefficients, which are the descriptors. Fourier descriptors are
subject to “noise” from poor segmentation and occlusion, since these error sources affect
the silhouette most immediately. Note that Fourier descriptors are not to be confused with
low-frequency Fourier spatial frequency components extracted by digital signal processing or
optical correlations, since spatial frequency involves image factors other than the segmented
silhouette (see Sect. B.4).
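A brief sketch of one common Fourier descriptor variant follows. Note that it is not the curvature expansion described above: it treats the boundary points as complex numbers $x + jy$, takes their discrete Fourier transform, and normalizes the coefficient magnitudes. The contour, the function names, and the choice of normalization are assumptions for illustration, and the contour is assumed to be traversed counter-clockwise so that the first harmonic is non-zero.

```python
import cmath
import math


def dft(samples):
    """Plain discrete Fourier transform (scaled by 1/n) of complex samples."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)) / n
            for k in range(n)]


def fourier_descriptors(contour, count):
    """contour: list of (x, y) boundary points in counter-clockwise traversal order.

    Dropping coefficient 0 removes position, taking magnitudes removes in-plane
    rotation and starting point, and dividing by |c_1| removes scale.
    """
    coeffs = dft([complex(x, y) for x, y in contour])
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + count)]


# Hypothetical silhouette: a 4-by-2 rectangle sampled at unit spacing.
contour = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4, 1),
           (4, 2), (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
descriptors = fourier_descriptors(contour, 4)
```

Any change to the silhouette, for example from poor segmentation or occlusion, changes the boundary samples and therefore the descriptors, which is the sensitivity noted in the text.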

B.3.3 Miscellaneous Global Descriptors. A variety of other features have been
defined and used for object recognition with imaging sensors. These include “complexity”
(ratio of number of edge pixels to number of internal pixels for some segmented region),
height-to-width ratio, brightness (intensity), “texture”, and so on (see [218:99], for example).
For the most part, these are PSRI (position-, scale-, and rotation-invariant) quantities.
Their utility and significance in segmentation and recognition have been investigated in
several studies, e.g. [186, 190, 226, 225].

B.4  Pattern  Recognition  Using  Correspondences. 

Correspondence methods require the classifier to identify the presence or absence of
some particular point, line segment, object or combination thereof (“features” or “landmarks”
for this type of classifier) on the object, in comparison with a priori data on likely
object classes [102, 110]. Correspondence classifiers are increasingly model-based, but
classification techniques may be heuristic, syntactic, or decision theoretic. The key to success
in a correspondence classifier is making the object-model-feature to observed-object-feature
association - in general this calls for higher resolution on the object surface than required
by global descriptor-based systems. Also, correspondence-based classifiers are generally
not position-, scale-, and (in-plane) rotation-invariant (PSRI).

B.5  Pattern  Recognition  Using  Correlations. 

Target-image  to  library-image  (template)  correlations  can  be  performed  using  2-D 
digital  image  processing  or  by  (much  faster)  optical  processing  using  the  Fourier  transform 
properties of optics [47, 58, 93, 98, 99, 103, 158, 178, 188, 232]. In general, correlation
systems  are  not  position,  scale,  and  (in-plane)  rotation-invariant,  but  at  least  one  technique 
achieves  these  attributes,  by  transforming  the  object  and  test  images  as  shown  in  Kobel 
and  Martin  [127],  using  the  method  of  Horev  [109].  Range  “maps”  obtained  by  laser  radar 
measurement  of  unclassified  targets  could  be  matched  to  known  targets  for  varied  aspect 
angles  using  forms  of  3-D  correlation. 

Since  correlations  are  extremely  sensitive  to  sensor-object  aspect  angle  [103]  and 
object  operating  state,  they  are  perhaps  best  accomplished  using  object  models  which  can 
be  oriented  and  adjusted  for  operating  conditions,  etc.,  to  produce  the  “best”  correlation 
with  an  input  object.  The  model  producing  the  highest  overall  correlation  value  is  taken 
to  indicate  the  object  class.  Correlations  can  be  performed  using  a  large  library  of  object 
representations  over  various  aspect  angles,  but  this  approach  is  increasingly  less  favored 
due  to  data  storage  requirements. 
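The selection of the object class by highest correlation value can be sketched as follows. The example assumes the observed image and each template are already registered and of equal size, and it uses the zero-mean normalized correlation coefficient as the similarity measure; the templates, labels, and function names are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation coefficient of two equal-size 2-D arrays."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den else 0.0


def classify_by_correlation(image, templates):
    """templates: dict label -> template image. Return (best label, best score)."""
    scores = {label: normalized_correlation(image, t) for label, t in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical 3-by-3 templates; in practice each class would contribute one
# template per aspect angle and operating state, or a model would render them.
templates = {
    "cross": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "ex":    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
}
# Observed image: the cross template with a different gain and offset.
observed = [[2, 5, 2], [5, 5, 5], [2, 5, 2]]
label, score = classify_by_correlation(observed, templates)
```

Because every aspect angle and operating state needs its own template in a pure library approach, the number of stored templates grows quickly, which is the storage concern raised above.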

B.6  Summary 

In  combination  with  the  ideas  presented  in  Sect.  2.2,  this  appendix  has  presented  a 
brief  overview  of  key  concepts  in  pattern  recognition,  particularly  as  applied  to  military 
targets.  For  further  information  in  this  area,  the  author  recommends  that  the  reader 
scan  the  bibliography  for  titles  of  interest.  General  references  [72,  212,  92,  90,  169,  178, 
70,  174]  are  appropriately  titled.  Other  sources  relating  to  particular  sensors  and  their 
phenomenology may also be of interest, and can be identified by their titles.

Appendix C. Detailed Equations for Estimation and Tracking Algorithms

C.1 Introduction

This appendix primarily presents the dynamics (state) and measurement equations
for the Kendrick/Maybeck/Reid [120, 121], Andrisani/Kuhl/Gleason [5], and
Sworder/Hutchins [208, 209] estimators. The equations are presented here using their original
variables, for which definitions are provided.

A less detailed description of the Bishop estimator is also provided. This tracking
algorithm was briefly mentioned in Sect. 2.3.1.4.

Finally, the equations for fixed point and fixed lag smoothing are given as in [154:16-17],
with minor comments regarding their implementation for this research. Additional
results for scenarios like those in Chapter V are also presented.


C.2  The  Kendrick  /  Maybeck  /  Reid  Estimator 


C.2.1  Kinematic  Filter. 

C.2.1.1 Kinematic Filter State Equations. Using the form of Eqn. (2.14),
the state equations for the kinematic filter in the Kendrick et al. estimator are:

where:

$P_{t/a_{N,E,D}}$ = position of the target relative to the attacker in inertial frame coordinates,
i.e., with components taken along the North, East, or Down axes of an earth-surface inertial
or navigation frame. Dot notation denotes time rate of change of the indicated variable,
as seen in the frame used for coordinatization.

$V_{t/a_{N,E,D}}$ = velocity of the target relative to the attacker, as observed from and
coordinatized in the navigation (inertial) frame

$\delta a_{N,E,D}$ = acceleration of the target relative to the attacker, in addition to acceleration
from normal load (lift), as observed from and coordinatized in the navigation (inertial)
frame

$g$ = acceleration due to gravity

$A_N = g(\alpha + \beta e^{\gamma\epsilon})$ = the magnitude of normal load acceleration, modeled as a positive
(non-zero) mean random process driven by the random variable $\epsilon$

$\epsilon(t)$ = a random variable which drives $A_N$ (strictly, $\epsilon$ is a stochastic process)

$\alpha, \beta, \gamma$ = parameters peculiar to particular aircraft types and operating conditions

$e$ = the base of natural logarithms

$(I_N)_{N,E,D}$ = the north, east, and down components of the unit vector $(I_N)$ in the
direction of normal load acceleration

$\tau_{N,E,D}$ = correlation times for first-order Gauss Markov models used to model
accelerations other than that due to (nominal) normal lift

$\tau_{\epsilon}$ = correlation time for the first order Gauss Markov process modelling the behavior
of $\epsilon$

$\dot{V}_{a/i_{N,E,D}}$ = acceleration of the attacker (sensor) relative to the navigation (inertial)
frame, coordinatized in the navigation frame (assumed available from the attacker’s inertial
navigation system - errors implicitly considered in noise parameters listed above)

$w_{V_N}, w_{V_E}, w_{V_D}, w_{\delta a_N}, w_{\delta a_E}, w_{\delta a_D}, w_{\epsilon}$ = appropriate continuous time (heuristically)
zero-mean white Gaussian process driving noises, with some strength $Q(t)$ as defined in Sect. 2.3.1

and:

$$G = \begin{bmatrix} 0_{3\times 7} \\ I_{7\times 7} \end{bmatrix} \tag{C.2}$$


C.2.1.2 Kinematic Filter Measurement Equations. Using the form of
Eqn. (2.15), the measurement equations for the kinematic filter in the Kendrick et al.
estimator are:

(C.3)

where:

$r$ = attacker-to-target range:

$$r = \left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 + (P_{t/a_D})^2 \right]^{1/2} \tag{C.4}$$

with corresponding rate of change:

$$\dot{r} = \frac{P_{t/a_N} V_{t/a_N} + P_{t/a_E} V_{t/a_E} + P_{t/a_D} V_{t/a_D}}{\left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 + (P_{t/a_D})^2 \right]^{1/2}} \tag{C.5}$$

$\eta$ = (azimuth) angle between the projection of the target/attacker vector onto the local
horizontal plane and local north at the attacker’s position:

$$\eta = \arctan\left[ \frac{P_{t/a_E}}{P_{t/a_N}} \right] \tag{C.6}$$

with corresponding rate of change:

$$\dot{\eta} = \frac{P_{t/a_N} V_{t/a_E} - P_{t/a_E} V_{t/a_N}}{(P_{t/a_N})^2 + (P_{t/a_E})^2} \tag{C.7}$$

$\sigma$ = (elevation) angle between the target/attacker vector and the vertical projection
of this vector onto the local horizontal plane:

$$\sigma = \arctan\left[ \frac{-P_{t/a_D}}{\left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 \right]^{1/2}} \right] \tag{C.8}$$

with corresponding rate of change:

$$\dot{\sigma} = \frac{P_{t/a_D}\left[ P_{t/a_N} V_{t/a_N} + P_{t/a_E} V_{t/a_E} \right] - V_{t/a_D}\left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 \right]}{\left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 + (P_{t/a_D})^2 \right] \left[ (P_{t/a_N})^2 + (P_{t/a_E})^2 \right]^{1/2}} \tag{C.9}$$

$\mathbf{z}(t_i)$, the measurement at time $t_i$, is modelled as the sum of $\mathbf{h}[\mathbf{x}(t_i)]$ from above
(found using truth state values) and a vector of discrete time zero mean white Gaussian
noise $\mathbf{v}(t_i)$, with an appropriate covariance $\mathbf{R}(t_i)$

and all other variables are defined above.
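The range, azimuth, and elevation measurements and their rates can be evaluated numerically as in the sketch below, which follows the definitions given in this subsection. The function name and the ordering of the returned tuple are assumptions, and `math.atan2` is used in place of a plain arctangent so that the azimuth is resolved in the correct quadrant.

```python
import math


def kinematic_measurements(p, v):
    """Noise-free measurements from relative position p and velocity v.

    p, v: (north, east, down) components of target position and velocity
    relative to the attacker. Returns (r, r_dot, eta, eta_dot, sigma, sigma_dot).
    """
    pn, pe, pd = p
    vn, ve, vd = v
    rho2 = pn * pn + pe * pe            # squared horizontal range
    r2 = rho2 + pd * pd                 # squared slant range
    r = math.sqrt(r2)
    rho = math.sqrt(rho2)

    r_dot = (pn * vn + pe * ve + pd * vd) / r
    eta = math.atan2(pe, pn)                              # azimuth from north
    eta_dot = (pn * ve - pe * vn) / rho2
    sigma = math.atan2(-pd, rho)                          # elevation, positive up
    sigma_dot = (pd * (pn * vn + pe * ve) - vd * rho2) / (r2 * rho)
    return r, r_dot, eta, eta_dot, sigma, sigma_dot


# Hypothetical geometry: target ahead, to the right, and above the attacker.
p = (1000.0, 500.0, -300.0)
v = (-50.0, 20.0, 5.0)
measurements = kinematic_measurements(p, v)
```

A simulated measurement would add a sample of the discrete-time white Gaussian noise to this vector. Comparing each rate against a finite difference of the corresponding angle or range is a convenient check on the signs.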

C.2.2  Aspect  Angle  Filter. 

C.2.2.1 Aspect Angle Filter State Equations. Using the form of Eqn. (2.5),
the state equations for the aspect angle filter in the Kendrick et al. estimator are:

where:

$\psi$ = azimuth angle, first of three Euler angle rotations taken to carry the navigation
frame into the body frame (see [79:112-113])

$\theta$ = elevation angle, second of three Euler angle rotations taken to carry the
navigation frame into the body frame

$\phi$ = bank angle, last of three Euler angle rotations taken to carry the navigation
frame into the body frame

$w_{(\cdot)}$ = appropriate continuous time (heuristically) zero-mean white Gaussian process
driving noises, with some strength $Q(t)$ as defined in Sect. 2.3.1


C.2.2.2 Aspect Angle Filter Measurement Equations. Using the form of
Eqn. (2.6), the measurement equations for the aspect angle filter in the Kendrick et al.
estimator are:

$$\mathbf{z} = \begin{bmatrix} \psi_i & \theta_i & \phi_i & \psi_k & \theta_k & \phi_k \end{bmatrix}^T \tag{C.11}$$

where:

$\psi_i$ = imaging sensor-derived azimuth angle
$\theta_i$ = imaging sensor-derived elevation angle
$\phi_i$ = imaging sensor-derived bank angle

xpk  =  kinematically-derived  yaw  angle  (an  estimate,  or  pseudo-measurement): 


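$$\psi_k = \arctan\left[\left(\frac{V_y\cos(\alpha_t)}{|V|} + \frac{A_{N_y}\sin(\alpha_t)}{|A_N|}\right) \Big/ \left(\frac{V_x\cos(\alpha_t)}{|V|} + \frac{A_{N_x}\sin(\alpha_t)}{|A_N|}\right)\right] \qquad \text{(C.12)}$$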
$\theta_k$ = kinematically-derived pitch angle (an estimate, or pseudo-measurement):


$$\theta_k = \arcsin\left[-\left(\frac{V_z\cos(\alpha_t)}{|V|} + \frac{A_{N_z}\sin(\alpha_t)}{|A_N|}\right)\right] \qquad \text{(C.13)}$$


$\phi_k$ = kinematically-derived bank angle (an estimate, or pseudo-measurement):


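$$\phi_k = \arctan\left[\left(\frac{V_x A_{N_y} - V_y A_{N_x}}{|V||A_N|}\right) \Big/ \left(\frac{V_z\sin(\alpha_t)}{|V|} - \frac{A_{N_z}\cos(\alpha_t)}{|A_N|}\right)\right] \qquad \text{(C.14)}$$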


and $H$ is therefore given by:

$$H = \begin{bmatrix} I_{3\times3} & 0_{3\times3} \\ I_{3\times3} & 0_{3\times3} \end{bmatrix} \qquad \text{(C.15)}$$

and all other quantities are defined above (note that $|\ldots|$ denotes magnitude of the quantity enclosed).

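The kinematically-derived pseudo-measurements can be illustrated geometrically. The sketch below is not taken from the Kendrick et al. source: it assumes an inertial frame with the z axis pointing down, assumes the body x axis is the velocity direction rotated by the angle of attack toward the normal-acceleration (lift) direction, and uses `atan2` rather than a plain arctangent. The function name and argument layout are hypothetical.

```python
import math


def kinematic_aspect_angles(v, a_n, alpha):
    """Yaw, pitch, and bank pseudo-measurements from velocity and normal acceleration.

    v     : inertial velocity (x, y, z), z down (assumed convention)
    a_n   : acceleration normal to v, same frame
    alpha : angle of attack in radians
    """
    v_mag = math.sqrt(sum(c * c for c in v))
    a_mag = math.sqrt(sum(c * c for c in a_n))
    vh = [c / v_mag for c in v]           # unit velocity vector
    ah = [c / a_mag for c in a_n]         # unit normal-acceleration vector
    ca, sa = math.cos(alpha), math.sin(alpha)
    # Body x axis: velocity direction rotated by alpha toward the lift direction.
    xb = [vh[i] * ca + ah[i] * sa for i in range(3)]
    psi = math.atan2(xb[1], xb[0])
    theta = math.asin(max(-1.0, min(1.0, -xb[2])))
    # Numerator: z component of the body y axis (vh x ah);
    # denominator: z component of the body z axis (vh*sin(alpha) - ah*cos(alpha)).
    num = vh[0] * ah[1] - vh[1] * ah[0]
    den = vh[2] * sa - ah[2] * ca
    phi = math.atan2(num, den)
    return psi, theta, phi
```

For level northward flight with equal upward and rightward normal acceleration and zero angle of attack, this returns zero yaw and pitch and a 45 degree bank, which matches the expected coordinated-turn geometry.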


C.3 The Andrisani / Kuhl / Gleason Estimator

The following equations are presented essentially as they appear in [5], with the exception that some changes were made to correct apparent errors. First, in the original source, the negative signs required for the state equation of the first-order Gauss-Markov process noises $b_{x,y,z}$ were not present (it is possible, but not stated or conventional, that a negative correlation time was intended). Second, it is clear that the denominator in the arctangent expression in the sixth element of the measurement equation requires an exponent of 1/2, which is not present in the original source. Third, review of [5] and [187] makes it clear that the term $k_6 v_p$ is required in Eqn. (C.16), although it is not found as such in this equation as it appears in [5].

C.3.1  State  Equations.  Using  the  form  of  Eqn.  (2.14),  the  state  equations  for  the 
Andrisani  et  al.  estimator  are: 
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[-(/„  -  Iyy)qT  +  *l/?  +  k2p  +  kjr]  4-1 
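$$\begin{bmatrix}
k_4/I_{xx} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & k_9/I_{yy} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & k_{15}/I_{zz} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & k_{16} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & k_{16} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & k_{16} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \\ w_7 \\ w_8 \\ w_9 \end{bmatrix} \qquad \text{(C.16)}$$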

where: 

$p, q, r$ = angular rates of the body frame relative to the inertial frame, along the body frame $x_b$, $y_b$, and $z_b$ axes as defined in Fig. 5.23.

x,y,z  =  target-to-sensor  distances  along  the  respective  inertial  frame  axes  (first  and 
second  derivatives  with  respect  to  time  as  observed  from  the  inertial  frame  denoted  by  one 
or  two  dots,  respectively) 

$\psi, \theta, \phi$ = inertial frame to body frame Euler angles as defined for the Kendrick et al. estimator.

$I_{xx}, I_{yy}, I_{zz}$ = moments of inertia about the body frame $x_b$, $y_b$, and $z_b$ axes, respectively.

$L$ = lift force magnitude, $= \frac{1}{2}\rho(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) S C_{L_\alpha} \alpha$, where:

$S$ = equivalent lifting surface area, and
$C_{L_\alpha}$ = the coefficient of lift

$\alpha$ = angle of attack as defined in Fig. 5.23, such that:

$\sin(\alpha) = C / v_p$, where:
$v_p = (A^2 + C^2)^{1/2}$, with:

$A = \cos(\theta)\cos(\psi)\dot{x} + \cos(\theta)\sin(\psi)\dot{y} - \sin(\theta)\dot{z}$

$C = [\cos(\phi)\sin(\theta)\cos(\psi) + \sin(\phi)\sin(\psi)]\dot{x} + [\cos(\phi)\sin(\theta)\sin(\psi) - \sin(\phi)\cos(\psi)]\dot{y} + \cos(\phi)\cos(\theta)\dot{z}$

$M$ = mass

$g$ = acceleration due to gravity

$b_{x,y,z}$ = a time-correlated acceleration process noise, modeled as the output of a first-order Gauss-Markov model

$\tau$ = correlation time constant for the previously noted process noise

$k_{1,2,3,\ldots,16}$ = aircraft-class dependent constants, corresponding to quantities given in [187]

$w_{1,2,3,4,5,6,7,8,9}$ = appropriate continuous-time (heuristically) zero-mean white Gaussian process driving noises, with some strength $Q(t)$ as defined in Sect. 2.3.1
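The angle of attack and lift relations in the list above can be sketched numerically as follows. This is only an illustration: the function names are hypothetical, `rho` is assumed to denote air density, and the velocity components are taken to be the time derivatives of $x$, $y$, $z$.

```python
import math


def angle_of_attack(psi, theta, phi, vel):
    """Angle of attack from Euler angles (radians) and inertial velocity (xdot, ydot, zdot)."""
    xd, yd, zd = vel
    # A: velocity component along the body x axis.
    a = (math.cos(theta) * math.cos(psi) * xd
         + math.cos(theta) * math.sin(psi) * yd
         - math.sin(theta) * zd)
    # C: velocity component along the body z axis.
    c = ((math.cos(phi) * math.sin(theta) * math.cos(psi) + math.sin(phi) * math.sin(psi)) * xd
         + (math.cos(phi) * math.sin(theta) * math.sin(psi) - math.sin(phi) * math.cos(psi)) * yd
         + math.cos(phi) * math.cos(theta) * zd)
    v_p = math.hypot(a, c)
    return math.asin(c / v_p)


def lift_magnitude(alpha, vel, rho, s_area, cl_alpha):
    """L = 0.5 * rho * |v|^2 * S * C_L_alpha * alpha (rho assumed to be air density)."""
    speed_sq = sum(c * c for c in vel)
    return 0.5 * rho * speed_sq * s_area * cl_alpha * alpha
```

For level flight with the nose pitched up by some angle and no bank, the computed angle of attack equals the pitch angle, as expected.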

C.3.2  Measurement  Equations.  Using  the  form  of  Eqn.  (2.15),  the  measurement 
equations  for  the  Andrisani  et  al.  estimator  are: 
where subscript m denotes measurement by a pose estimator, and

$\nu_{1,2,3,4,5,6,7,8,9}$ = discrete-time zero-mean white Gaussian noises with appropriate covariance $R(t)$ as defined in Sect. 2.3.1; evidently, Andrisani et al. chose to use “$\nu$” rather than “$v$” notationally as we have done for this form of variable.

$R$ (range), $\eta$ (azimuth angle), and $\epsilon$ (elevation angle) measurements and corresponding rate measurements are defined exactly as for the Kendrick et al. estimator.

and  all  other  quantities  are  defined  above. 

C.4  The  Sworder  /  Hutchins  Estimator 

Recognizing that most “pose” estimators assign the target to one of a finite number of discrete orientations (as discussed earlier in Sect. 2.2), say $l$ in number, and that lateral acceleration can likewise be discretized, to say $k$ in number, Sworder defines a Kronecker product space [38] $a \otimes p$ consisting of all possible orientation ($a$) and acceleration ($p$) combinations. He then notes that transitions in this space occur according to the theory of marked point processes, as defined by Snyder [205]. Using this theory, Sworder et al. are able to define a differential equation which governs the estimate of an indicator for the presence of the system in any of the $k \times l$ states:


$$d\phi_t = Q^T\phi_t\,dt + (\mathrm{diag}(\phi_t) - \phi_t\phi_t^T)(\Lambda(1_k^T \otimes I_l))^T \times \mathrm{diag}(\hat{\lambda}^{-1})\,d\sigma_t \qquad \text{(C.18)}$$

where (note that “×” denotes multiplication in the Sworder development, not vector cross product):

$\phi_t$ = a vector of dimension $k \times l$, each element of which is a value from zero to one representing the probability that the true target is in that particular acceleration/orientation state

$Q$ = a $k \times l$ by $k \times l$ transition rate matrix giving the probability rate of transitions between states - e.g., the probability that the target is in state 3 at some time $t + dt$, given that the target is in state 2 at time $t$, is equal to $q_{23} \times dt$, where $q_{23}$ is the element of the matrix $Q$ in the second row and third column. Diagonal elements are handled differently - the probability that the target is in any state $i$ at some time $t + dt$, given that the target is in state $i$ at time $t$, is equal to $(1 - q_{ii}) \times dt$

$\mathrm{diag}(\phi_t)$ = a $k \times l$ by $k \times l$ diagonal matrix, elements of which are the elements of the vector $\phi_t$

$\Lambda$ = an $l \times l$ matrix, the $i$-$j$th (row-column) element of which is the expected rate of measurements showing target orientation in the $i$th aspect bin, if the target is in fact in the $j$th aspect bin

$(1_k^T \otimes I_l)^T$ = the transpose of the Kronecker product of the transpose of a $k$-dimensional vector of ones and an $l \times l$ identity matrix, defining a $k \times l$ by $l$-dimensional matrix

$\mathrm{diag}(\hat{\lambda}^{-1})$ = a diagonal matrix of dimension $l \times l$, the elements of which are the inverse of the elements of the $l$-dimensional vector defined by $\Lambda \times (1_k^T \otimes I_l) \times \phi_t$

$\sigma_t$ = a vector of length $l$, each element of which represents the number of observations with that particular aspect bin reading over the entire measurement history

$t$ = time

Like the Kalman filter state estimator, this estimator operates using a set of propagations and updates [208]. Propagations are given by:

$$\dot{\phi}_t = Q^T\phi_t \qquad \text{(C.19)}$$

and updates by:

$$\delta\phi_t = (\mathrm{diag}(\phi_t) - \phi_t\phi_t^T)(\Lambda(1_k^T \otimes I_l))^T \times \mathrm{diag}(\hat{\lambda}^{-1})\,\delta\sigma_t \qquad \text{(C.20)}$$

(where “$\delta$” denotes an impulsive change in the indicated variables at measurement time, and all other variables are defined above). The measurement at any given time is the apparent discrete orientation value from the pose estimator. Note, as does Sworder, that although Eqn. (C.18) is presented in continuous-time form, for point process measurements, where $d\sigma_t$ is zero except at measurement updates, Eqns. (C.19) and (C.20) are a correct implementation of Eqn. (C.18). Finally, the orientation and acceleration values define an inertial acceleration which plays the part of a reasonably well estimated control input ($u(t)$, in Eqn. (2.14)) for the state equation of an extended Kalman filter which uses conventional range and angle-derived kinematic measurements.
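The propagate/update cycle of Eqns. (C.19) and (C.20) can be sketched in a few lines. The sketch below is illustrative only: the function names are hypothetical, the propagation is a simple Euler step, the transition rate matrix is assumed to have rows summing to zero (a conventional generator matrix), and states are assumed to be ordered so that index `c` corresponds to orientation bin `c % l`.

```python
def kron_ones_identity(k, l):
    """(1_k^T kron I_l): an l-by-(k*l) matrix of k side-by-side identity blocks."""
    return [[1.0 if (c % l) == r else 0.0 for c in range(k * l)] for r in range(l)]


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(phi, q, dt):
    """One Euler step of Eqn. (C.19): phi_dot = Q^T phi."""
    n = len(phi)
    return [phi[i] + dt * sum(q[j][i] * phi[j] for j in range(n)) for i in range(n)]


def update(phi, lam_matrix, k, l, bin_index):
    """Impulsive update of Eqn. (C.20) for one observation in aspect bin bin_index."""
    n = k * l
    b = matmul(lam_matrix, kron_ones_identity(k, l))                   # l-by-n
    rates = [sum(b[r][c] * phi[c] for c in range(n)) for r in range(l)]  # predicted rates
    d_sigma = [1.0 if r == bin_index else 0.0 for r in range(l)]       # one new count
    g = [d_sigma[r] / rates[r] for r in range(l)]                      # diag(rates^-1) d_sigma
    bt_g = [sum(b[r][c] * g[r] for r in range(l)) for c in range(n)]   # B^T g
    inner = sum(phi[c] * bt_g[c] for c in range(n))                    # phi^T B^T g
    d_phi = [phi[c] * bt_g[c] - phi[c] * inner for c in range(n)]      # (diag(phi) - phi phi^T) B^T g
    return [phi[c] + d_phi[c] for c in range(n)]
```

With a single observation, the update reduces to the Bayes rule result (prior times measurement rate, renormalized), so the probabilities continue to sum to one.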

C.5  Nonlinear  Kinematic  Tracker  Effort  by  Bishop. 

Bishop’s effort, referred to in Sect. 2.3.1.4, is based on work by Krener et al. [26, 87, 131, 132, 130, 45, 215], and shows that, for a particular target maneuver model, if a (one-to-one, onto, $C^\infty$, generally nonlinear) transformation $T: R^n \to R^n$, defining $x = T(y)$, exists by which a nonlinear state equation of the form in Eqn. (2.24) (although Bishop explicitly defines his stochastic differential and integral equations “in the sense of Stratonovich,” rather than Itô) can be transformed into an approximate observer (state) canonical form:

$$\dot{y}(t) = Ay(t) + b(y(t)) + w(t) \qquad \text{(C.21)}$$

with continuous-time measurements given by:

$$z(t) = Cy(t) + v(t) \qquad \text{(C.22)}$$

(where $y$ is a vector in the transformed space, $A$ and $C$ are block diagonal matrices, $b$ is a nonlinear transformation acting on $y(t)$, and other variables are as defined earlier, except that the measurement noise $v(t)$ parameters now conform to a continuous-time measurement description), then a stable estimator of the form in Eqns. (2.16) and (2.18) (i.e., working directly in the untransformed or original vector space) can be defined without explicit computation of the transformation $T$. Bishop finds that this estimator provides improved tracking performance over an extended Kalman filter using the same (planar turn) target model, and proves that his geometric nonlinear filter is locally asymptotically stable for deterministic inputs, and that the second non-central moment of the estimate error probability density is bounded for stochastic inputs.
The Bishop development does not address the issue of the sensitivity of operating this continuous-time filter on a digital computer with discrete-time measurements - the usual environment for modern estimators. In general, operating a continuous-time filter as a discrete-time process, rather than starting with a discrete-time model and then designing the estimator, can lead to performance problems [153:43,172].

C.6 Equations and Additional Results for Optimal Smoothing

C.6.1  Overview.  Implementation  of  the  target  recognition  approach  in  Chapter  V 
requires  an  estimate  of  target  acceleration  based  on  kinematic  tracking  information  only. 
For  this  purpose,  the  author  implemented  an  optimal  smoothing  routine.  This  proposal  to 
use  optimal  smoothing  in  a  tactical  application  is  believed  to  be  relatively  new  -  the  only 
published  references  known  to  make  a  similar  proposal  are  by  Mahalanabis  et  al.  [148,  149]. 
Note  that  optimal  smoothing  for  reentry  vehicle  state  estimation  in  strategic  defense  and 
space  operations  is  believed  to  be  relatively  common  (e.g.,  see  [51]). 

The  implementation  of  a  fixed  lag  smoother  actually  starts  with  the  operation  of  a 
fixed  point  smoother  for  as  many  measurement  intervals  as  required  to  achieve  the  desired 
lag  between  the  current  time  and  the  first  time  for  which  the  fixed  lag  smoother  state 
estimate  is  required.  Accordingly,  we  present  the  equations  for  both  forms  of  smoother,  as 
implemented  for  this  effort.  All  equations  are  taken  from  [155:1-18],  which  in  turn  refers 
to  [159]  and  [160],  among  others.  We  also  consider  details  of  smoother  implementation 
using  output  from  the  Kalman  filter  simulator  MSOFE  [167].  All  of  the  information 
required  to  obtain  these  smoother  estimates  is  obtained  from  the  concurrently-operating 
conventional 9-state Kalman filter with Gauss-Markov acceleration (i.e., Singer model [202],
as described in Sects. 5.5.2 and 2.3.2.1).

The  smoother  equations  given  here  provide  an  optimal  estimate  for  the  case  of  tar¬ 
gets  described  by  linear  state  models,  driven  by  (heuristically)  continuous  white  Gaussian 
noise,  having  measurements  that  are  a  linear  function  of  the  states,  corrupted  by  additive 
discrete  white  Gaussian  noise.  We  refer  to  these  as  the  classic  “linear,  white,  Gaussian” 
filter and measurement model assumptions, as discussed in Sect. 2.3.1.1. None of these
requirements is ever actually satisfied in practice. In particular, real systems are infinite
dimensional, while feasible estimators are inherently finite dimensional. Thus, our models
are  purposefully  of  reduced  order,  and  we  seek  a  level  of  modelling  fidelity  that  balances 
feasibility  with  accuracy. 

In  our  case,  the  (truth  model)  aircraft  targets  were  modelled  as  moving  instanta¬ 
neously  from  linear  flight  to  constant  turn  rate  (i.e.,  constant  speed,  constant  non-zero 
specific  force  normal  to  the  velocity  vector)  turns  lasting  5-10  seconds,  and  then  returning 
to  linear  flight.  Thus,  a  Gauss-Markov  acceleration  filter  model  is  hardly  a  perfect  model. 
However,  the  filter  dynamics  are  linear,  and,  since  measurements  are  given  as  position  and 
one  (doppler)  velocity  component  in  a  sensor  frame,  the  measurement  model  is  linear  with 
respect  to  the  states  (although  the  measurement  matrix  H  changes  with  target  position 
relative  to  the  sensor,  or  effectively,  with  time).  In  any  case,  the  “optimal”  smoother 
performed  quite  well. 

However,  since  smoother- derived  acceleration  states  were  still  rather  more  noisy  in 
our  research  than  desired  for  defining  target  acceleration,  target  acceleration  was  ultimately 
estimated  by  fitting  smoother- derived  position  as  a  function  of  time  to  second-degree  poly¬ 
nomial  curves,  and  differentiating  twice  to  find  acceleration.  This  process  is  also  discussed. 

Finally,  additional  results  are  presented  and  discussed.  Plots  were  generated  using 
Matlab  [151]. 

C.6.2 The Fixed Point Smoother. The fixed point smoother gives the optimal state estimate $\hat{x}(t_i \mid t_j)$ for some (discrete) time $t_i$, conditioned on measurements and corresponding Kalman filter outputs through some equal or later time $t_j$, where $\hat{x}(t_i \mid t_i)$ is simply $\hat{x}(t_i^+)$ from the concurrently running Kalman filter. This smoothed estimate is found using the following equations, for $t_j \geq t_{i+1}$:

$$\hat{x}(t_i \mid t_j) = \hat{x}(t_i \mid t_{j-1}) + W(t_j)[\hat{x}(t_j^+) - \hat{x}(t_j^-)] \qquad \text{(C.23)}$$


where:

$$W(t_j) \triangleq \prod_{k=i}^{j-1} A(t_k) = W(t_{j-1})A(t_{j-1}) \qquad \text{(C.24)}$$

and:

$$A(t_k) = P(t_k^+)\Phi^T(t_{k+1}, t_k)P^{-1}(t_{k+1}^-) \qquad \text{(C.25)}$$

The terms in these equations are found as follows:

$\hat{x}(t_j^+)$ = the filter state estimate following update at time $t_j$.

$\hat{x}(t_j^-)$ = the filter state estimate prior to update at time $t_j$.

$P(t_k^+)$ = the filter covariance estimate following update at time $t_k$.

$P(t_{k+1}^-)$ = the filter covariance estimate prior to update at time $t_{k+1}$.

$\Phi^T(t_{k+1}, t_k)$ = the transpose of the classical state transition matrix $\Phi(t_{k+1}, t_k)$ for propagation of primal state variables from time $t_k$ to $t_{k+1}$ (i.e., $\Phi^T(t_{k+1}, t_k)$ is the state transition matrix for propagating adjoint quantities backward in time).

Note that $W(t_i) = I_{n \times n}$, an identity matrix of order $n$ (the dimension of $x$).

If the classical “linear, white, Gaussian” modelling assumptions apply (as in Sect. 2.3.1.1), the covariance $P(t_i \mid t_j)$ of the estimate $\hat{x}(t_i \mid t_j)$ (alternatively, the covariance of the error in this estimate) is given by:

$$P(t_i \mid t_j) = P(t_i \mid t_{j-1}) + W(t_j)[P(t_j^+) - P(t_j^-)]W^T(t_j)$$
$$\qquad\quad = P(t_i \mid t_{j-1}) - W(t_j)K(t_j)H(t_j)P(t_j^-)W^T(t_j) \qquad \text{(C.26)}$$

where all terms are defined above or in Sect. 2.3.1.1, and $P(t_i \mid t_i)$ is simply $P(t_i^+)$ from the concurrently running Kalman filter. If the classical assumptions do not apply, then Eqn. (C.26) at least provides an estimate of the estimate or error covariance.
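To make the recursion in Eqns. (C.23) through (C.26) concrete, here is a minimal scalar sketch. It is not the author's MSOFE-based implementation: the function names are hypothetical, the model is a single-state system with constant transition `phi`, discrete process noise `qd`, measurement gain `h`, and noise variance `r`, all chosen purely for illustration.

```python
def kalman_filter(zs, phi, qd, h, r, x0, p0):
    """Scalar Kalman filter; returns per-step (x_minus, p_minus, x_plus, p_plus)."""
    x, p, hist = x0, p0, []
    for k, z in enumerate(zs):
        if k > 0:                                # propagate between measurements
            x, p = phi * x, phi * p * phi + qd
        x_minus, p_minus = x, p
        gain = p_minus * h / (h * p_minus * h + r)
        x = x_minus + gain * (z - h * x_minus)   # measurement update
        p = (1.0 - gain * h) * p_minus
        hist.append((x_minus, p_minus, x, p))
    return hist


def fixed_point_smoother(hist, phi, i):
    """Smooth the estimate at index i using all later filter outputs."""
    xs, ps = hist[i][2], hist[i][3]              # x(t_i|t_i), P(t_i|t_i)
    w = 1.0                                      # W(t_i) = identity
    for j in range(i + 1, len(hist)):
        a_prev = hist[j - 1][3] * phi / hist[j][1]   # A(t_{j-1}), Eqn. (C.25)
        w = w * a_prev                               # W(t_j), Eqn. (C.24)
        x_minus, p_minus, x_plus, p_plus = hist[j]
        xs = xs + w * (x_plus - x_minus)             # Eqn. (C.23)
        ps = ps + w * (p_plus - p_minus) * w         # Eqn. (C.26), first equality
    return xs, ps
```

Because the covariance recursion uses only the first equality of Eqn. (C.26), the sketch never needs the gain matrix explicitly. In the scalar case the result agrees with a backward (Rauch-Tung-Striebel style) pass over the same filter history.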
C.6.3 The Fixed Lag Smoother. The optimal state estimate for an $N$-step time lag is given by:

$$\hat{x}(t_{i+1} \mid t_{i+N+1}) = \Phi(t_{i+1}, t_i)\hat{x}(t_i \mid t_{i+N}) + C(t_{i+N+1})K(t_{i+N+1})[z_{i+N+1} - H(t_{i+N+1})\hat{x}(t_{i+N+1}^-)] + U(t_{i+1})[\hat{x}(t_i \mid t_{i+N}) - \hat{x}(t_i^+)] \qquad \text{(C.27)}$$

where the n-by-n “gain” matrix $C(t_{i+N+1})$ is given by

$$C(t_{i+N+1}) = \prod_{k=i+1}^{i+N} A(t_k) = A^{-1}(t_i)C(t_{i+N})A(t_{i+N}) \qquad \text{(C.28)}$$

with $A(t_k)$ defined in Eqn. (C.25); and the n-by-n matrix $U(t_{i+1})$ is given by

$$U(t_{i+1}) = G_d(t_i)Q_d(t_i)G_d^T(t_i)\Phi^T(t_i, t_{i+1})P^{-1}(t_i^+) \qquad \text{(C.29)}$$

Under the usual “linear, white, Gaussian” assumptions, the smoothed estimate has zero-mean error, and the covariance of the estimate and/or error is given by:

$$P(t_{i+1} \mid t_{i+N+1}) = P(t_{i+1}^-) - C(t_{i+N+1})K(t_{i+N+1})H(t_{i+N+1})P(t_{i+N+1}^-)C^T(t_{i+N+1}) - A^{-1}(t_i)[P(t_i^+) - P(t_i \mid t_{i+N})]A^{-1}(t_i)^T$$
$$\qquad\quad = P(t_{i+1}^-) - C(t_{i+N+1})[P(t_{i+N+1}^-) - P(t_{i+N+1}^+)]C^T(t_{i+N+1}) - A^{-1}(t_i)[P(t_i^+) - P(t_i \mid t_{i+N})]A^{-1}(t_i)^T \qquad \text{(C.30)}$$

The fixed lag smoother is initialized using a state estimate $\hat{x}(t_0 \mid t_N)$ and covariance estimate $P(t_0 \mid t_N)$ found by iterating the fixed point smoother for fixed point $t_0$ a total of $N$ times. These values become $\hat{x}(t_i \mid t_{i+N})$ in Eqn. (C.27) and $P(t_i \mid t_{i+N})$ in Eqn. (C.30), respectively.
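The fixed lag recursion, including its fixed point initialization, can be sketched for a scalar system as follows. This is an illustrative sketch under simplifying assumptions (single state, constant `phi` and `qd`, unit noise input gain), with hypothetical function names; the filter correction `x_plus - x_minus` is used in place of the gain-times-residual product, to which it is equal.

```python
def run_filter(zs, phi, qd, h, r, x0, p0):
    """Scalar Kalman filter; stores pre- and post-update estimates and covariances."""
    x, p, hist = x0, p0, []
    for k, z in enumerate(zs):
        if k > 0:                                  # propagate between measurements
            x, p = phi * x, phi * p * phi + qd
        xm, pm = x, p
        gain = pm * h / (h * pm * h + r)
        x = xm + gain * (z - h * xm)
        p = (1.0 - gain * h) * pm
        hist.append({"xm": xm, "pm": pm, "xp": x, "pp": p})
    return hist


def fixed_lag_smoother(hist, phi, qd, n_lag):
    """Return [(index, x_smoothed, p_smoothed)] for an n_lag-step lag (scalar case)."""
    # Initialise with the fixed point smoother for t_0, iterated n_lag times.
    xs, ps, c = hist[0]["xp"], hist[0]["pp"], 1.0
    for j in range(1, n_lag + 1):
        c *= hist[j - 1]["pp"] * phi / hist[j]["pm"]
        xs += c * (hist[j]["xp"] - hist[j]["xm"])
        ps += c * (hist[j]["pp"] - hist[j]["pm"]) * c
    out = [(0, xs, ps)]
    for i in range(len(hist) - n_lag - 1):
        new = hist[i + n_lag + 1]
        a_i = hist[i]["pp"] * phi / hist[i + 1]["pm"]      # A(t_i)
        a_end = hist[i + n_lag]["pp"] * phi / new["pm"]    # A(t_{i+N})
        c = c / a_i * a_end                                # Eqn. (C.28) recursion
        u = qd / (phi * hist[i]["pp"])                     # Eqn. (C.29) with G_d = 1
        x_next = (phi * xs + c * (new["xp"] - new["xm"])
                  + u * (xs - hist[i]["xp"]))              # Eqn. (C.27)
        p_next = (hist[i + 1]["pm"] - c * (new["pm"] - new["pp"]) * c
                  - (hist[i]["pp"] - ps) / (a_i * a_i))    # Eqn. (C.30), second form
        xs, ps = x_next, p_next
        out.append((i + 1, xs, ps))
    return out
```

Each output pairs a time index with the estimate of that state conditioned on measurements `n_lag` steps later. Note that the recursion divides by `a_i` at every step, which is the "backing out" of old information discussed later in connection with numerical stability.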
C.6.4 Optimal Smoothing with MSOFE Output. In the Kalman filter simulator MSOFE [167], all outputs are controlled from subroutine OUT. For smoother implementation using MSOFE outputs, the pre-measurement update quantities $\hat{x}(t_i^-)$ and $P(t_i^-)$ are available after subroutine USROUT is called from subroutine OUT with calling argument IOUT = 3. Post-measurement quantities $\hat{x}(t_i^+)$ and $P(t_i^+)$ are available after subroutine USROUT is called with IOUT = 5.

Since MSOFE performs sequential scalar updates, the quantities $K(t_i)$ and $H(t_i)$ are made available vector by vector at each scalar update after subroutine USROUT is called with IOUT = 4 (m events per set of scalar measurements). These quantities are saved at each IOUT = 4 call and output as matrices with post-measurement state updates at each IOUT = 5 event. Note that $K(t_i)$ is provided column-by-column, while $H(t_i)$ is provided row-by-row. Similarly, the expected measurement vector $H(t_i)\hat{x}(t_i^-)$ and corresponding noise-corrupted measurement are made available scalar by scalar, and output as vectors at each IOUT = 5 event.

It is important to understand, however, that the Kalman filter gain $K(t_i)$ and expected measurement values $H(t_i)\hat{x}(t_i^-)$ generated in this way from successive scalar measurement updates in MSOFE are not the same as those which would be generated by a classical vector measurement update filter. As discussed in Sect. 4.3, where successive measurements provide non-independent information about the same states, or measurement noise is not independent between measurements, we may expect that this MSOFE-provided sequence of expected measurement values and Kalman filter gains reflects information from previous scalar updates. In MSOFE, the classical Kalman gain matrix and expected measurement value vector are not generally available prior to the sequence of measurement updates.

For example, in our case, each of the three target position measurements in general provides some information about target position along each of the three inertial axes. However, since the “sensor frame” axes are orthogonal to one another, and measurement noise is considered independent on each axis, the position measurements are in fact independent from one another with respect to the information each provides about any one state. Therefore, we do not expect that one sensor frame scalar position measurement update will change the expected measurement value for a subsequent scalar position measurement in the same measurement set. On the other hand, the doppler velocity expected measurement is not independent from the prior measurements: in particular, it is affected by the prior measurement of range, or position along the sensor “boresight” axis - thus expected doppler measurements before and after the scalar position measurements will differ.

The smoother equations given here and in [154:13-14] apply to the vector update case. With careful manipulation and consideration, however, these equations can be made optimal as well for information available from MSOFE, as taken from sequential scalar updates. First of all, the reader should note that the quantity $K(t_{i+N+1})[z_{i+N+1} - H(t_{i+N+1})\hat{x}(t_{i+N+1}^-)]$ from Eqn. (C.27), or $\hat{x}(t_{i+N+1}^+) - \hat{x}(t_{i+N+1}^-)$, is exactly the same vector quantity whether the Kalman gain matrix $K(t_{i+N+1})$ and the residual vector $[z(t_{i+N+1}) - H(t_{i+N+1})\hat{x}(t_{i+N+1}^-)]$ are computed before the measurement update, or assembled afterward using Kalman gain vectors and scalar residuals from successive scalar updates. Similarly, although the quantity $K(t_{i+N+1})H(t_{i+N+1})P(t_{i+N+1}^-)$ will be computed incorrectly in Eqn. (C.30) if $K$ is assembled from MSOFE Kalman gain vectors, the correct quantity is available in any case as $P(t_{i+N+1}^-) - P(t_{i+N+1}^+)$, computed from filter covariance values (respectively) before any and after all measurements.
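The relationship between sequential scalar updates and a single vector update can be checked on a small example. The sketch below is purely illustrative and does not reproduce MSOFE: it assumes a two-state, two-measurement problem with independent (diagonal) measurement noise, and all function names are hypothetical.

```python
def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inv2(m):
    """Inverse of a 2-by-2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def vector_update(x, p, z, h, r):
    """Classical vector measurement update; returns x_plus, P_plus, residual vector."""
    ht = transpose(h)
    s = matmul(matmul(h, p), ht)
    s = [[s[i][j] + r[i][j] for j in range(2)] for i in range(2)]
    k = matmul(matmul(p, ht), inv2(s))
    resid = [z[i] - sum(h[i][j] * x[j] for j in range(2)) for i in range(2)]
    x_new = [x[i] + sum(k[i][j] * resid[j] for j in range(2)) for i in range(2)]
    khp = matmul(matmul(k, h), p)
    p_new = [[p[i][j] - khp[i][j] for j in range(2)] for i in range(2)]
    return x_new, p_new, resid


def sequential_update(x, p, z, h, r):
    """Scalar updates applied one measurement at a time (diagonal r assumed)."""
    x, p, resids = list(x), [row[:] for row in p], []
    for i in range(len(z)):
        hi = h[i]
        ph = [sum(p[a][b] * hi[b] for b in range(2)) for a in range(2)]   # P h^T
        s = sum(hi[a] * ph[a] for a in range(2)) + r[i][i]
        k = [ph[a] / s for a in range(2)]
        resid = z[i] - sum(hi[a] * x[a] for a in range(2))   # uses the partly updated state
        resids.append(resid)
        x = [x[a] + k[a] * resid for a in range(2)]
        # P is symmetric, so h P = (P h^T)^T and P - k h P can reuse ph.
        p = [[p[a][b] - k[a] * ph[b] for b in range(2)] for a in range(2)]
    return x, p, resids
```

In this example the final state and covariance agree between the two forms, while the second scalar residual differs from the corresponding element of the vector residual because it already reflects the first scalar update.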

Note, however, that further complications are to be expected regarding the relationship between scalar and vector measurement updates if feedback control, sensor failures, or nonlinear processes come into play from scalar update to update. If desired, the classical $K(t_j)$ matrix can be reconstructed using pre-measurement information and Eqn. (2.11), and expected measurement values can likewise be found prior to any scalar updates. For the fixed point smoother, the need to reconstruct the matrix $K(t_j)$ in this way is avoided in any case by using the first equality in Eqn. (C.26).

To limit complication in the MSOFE code, the author’s smoother uses MSOFE scalar-update-generated Kalman gains and residuals, subject to the considerations noted above, subsequently assembled into vector-matrix form. These MSOFE-generated quantities were written to a file “SMOODAT” with a format similar to that of the MSOFE output file “CTOM” (plot output for continuous variables), which is processed by program MPLOT (a plot post-processor for MSOFE output). The Smoother/Recognizer program, then, was a derivation of MPLOT which routed input data from “SMOODAT” into smoother subroutines rather than plot processing subroutines.

The  only  problem  encountered  in  operating  the  fixed  point  and  fixed  lag  smoothers 
was  an  instability  problem  in  the  fixed  lag  smoother  after  3-4  seconds  in  single  precision, 
or  15  seconds  in  double  precision.  At  least  part  of  this  instability  is  believed  due  to  sub- 
optimal  covariance  matrix  inversions  associated  with  Eqn.  (C.25),  in  which  old  information 
is  “backed  out”  of  the  smoother.  The  smoother  was  first  implemented  in  single  precision, 
which  implies  a  4-byte  word  length  on  SPARC  workstations.  The  rapid  development  of 
instability  in  this  mode  was  investigated,  and  it  was  found  that  single  precision  matrix 
inversion,  followed  by  multiplication  of  the  original  matrix  with  its  inverse,  resulted  in 
off-diagonal terms as large as $10^{-3}$ - i.e., significantly non-zero. It was clear that 30-40
iterations  of  this  process  over  3-4  seconds  could  introduce  significant  numerical  error  into 
the  computations. 

Double  precision  inversion  and  double  precision  multiplication  were  required  to  keep 
off-diagonal terms smaller than $10^{-8}$. Running the smoother with double precision operations
allowed the algorithm to run for lengths of time sufficient to prove the concept,
but,  even  in  double  precision,  the  fixed-lag  smoother  will  become  unstable  after  15  or  so 
seconds. 

This  instability  is  but  a  minor  concern  for  at  least  four  reasons.  First,  in  an  ac¬ 
tual  on-line  implementation,  most  smoother  parameters  could  be  kept  as  fixed  gains,  and 
operations  like  Eqn.  (C.25)  would  not  be  implemented  on-line.  Second,  if  it  were  neces¬ 
sary  to  do  so,  the  smoother  can  be  “freshened”  by  restarting  the  recursive  computation 
of  Eqn.  (C.25)  with  stored  values.  Third,  one  may  choose  to  implement  a  “fully  factored” 
or  numerically  stable  form  of  smoother  [31].  Finally,  as  noted,  the  time  period  prior  to 
instability  was  adequate  for  our  purposes  in  any  case. 

C.6.5  Curve  Fitting.  As  discussed  in  Sect.  5.6,  the  smoother- derived  target 
acceleration  estimate  per  se  was  much  closer  to  the  true  target  acceleration  than  the  filter- 
derived  acceleration  estimate,  but  smoother-derived  acceleration  was  still  rather  more  noisy 
than  desired  for  algorithm  functioning.  It  must  be  emphasized  that  the  author’s  algorithm 
is designed to identify target accelerations relative to the body frame that remain reasonably
constant within defined bounds (nominally +/- 5 feet per second squared over a 0.8
second period) - indicative of a steady state condition in the target maneuver. The identi¬
fication  of  this  condition  and  a  corresponding  nominal  acceleration  level  define  the  start  of 
the  recognition  process  -  for  this  reason,  a  smooth,  if  not  perfectly  accurate,  acceleration 
estimate  is  essential.  Errors  in  the  accuracy  of  the  acceleration  estimate  are  overcome  by 
the  dynamic  programming  “motion  warping”  process. 

The  final  acceleration  estimation  method  selected  was  to  fit  smoother- derived  posi¬ 
tion  to  a  second  degree  (quadratic)  polynomial  curve,  and  differentiate  twice  to  find  accel¬ 
eration.  This  curve  fitting  was  accomplished  with  the  use  of  the  IMSL  utility  “RCURVE” 
[114],  which  provides  least-squares  polynomial  curve  fit  parameters  for  a  curve  of  given 
degree  to  fit  an  array  of  points.  In  this  application,  the  smoother- derived  position  along 
each  inertial  (filter)  axis  is  maintained  as  a  set  of  43  values  at  0.1  sec.  intervals,  covering 
an  elapsed  time  of  4.2  seconds,  and  the  acceleration  is  taken  at  the  midpoint  of  the  curve. 
Thus,  typically,  a  total  of  4.1  seconds  is  required  before  the  curve  fit-derived  acceleration 
estimate is available for some given (real) time. That is, 4.1 sec. = 2.0 sec. smoother lag
+ 2.1 sec. (= 0.5 [4.2 sec.]) curve fit delay.
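The curve fit step can be illustrated with a small least-squares sketch. This is not the IMSL “RCURVE” routine: it is a plain normal-equations fit written for illustration, with a hypothetical function name, and it centers the time axis on the window midpoint for conditioning.

```python
def quadratic_fit_acceleration(times, positions):
    """Least-squares quadratic fit p ~ c0 + c1*tau + c2*tau^2; returns 2*c2.

    tau is time measured from the window midpoint. For a quadratic, the second
    derivative is constant, so the value also applies at the midpoint.
    """
    t_mid = 0.5 * (times[0] + times[-1])
    tau = [t - t_mid for t in times]
    # Normal equations: sums of powers of tau and moments of the data.
    s = [sum(t ** k for t in tau) for k in range(5)]
    b = [sum(p * t ** k for p, t in zip(positions, tau)) for k in range(3)]
    m = [[s[0], s[1], s[2], b[0]],
         [s[1], s[2], s[3], b[1]],
         [s[2], s[3], s[4], b[2]]]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        piv = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= f * m[col][k]
    c = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        c[row] = (m[row][3] - sum(m[row][k] * c[k] for k in range(row + 1, 3))) / m[row][row]
    return 2.0 * c[2]
```

With 43 samples at 0.1 second spacing the window spans 4.2 seconds, so an acceleration value taken at the midpoint is 2.1 seconds old; added to a 2.0 second smoother lag, this gives the 4.1 second total delay noted above.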

It  should  be  noted  that  the  use  of  the  IMSL  utility  “RCURVE”  is  a  simple,  but 
inefficient  way  to  implement  a  least  squares  curve  fit.  Had  this  curve  fit  been  performed 
internally  to  the  author’s  code,  a  recursive  update  to  the  curve  fit  parameters  would  have 
been  possible,  with  low  computational  load. 

Other  approaches  for  acceleration  state  estimation  by  curve  fitting  to  position  mea¬ 
surements  are  discussed  in  [50]  and  [74].  The  author  does  not  recommend  acceleration 
state  estimation  by  curve  fitting  to  Kalman  filter- derived  position  measurements  per  se, 
for  reasons  that  will  be  clear  in  the  following  section. 

C.6.6 Results and Discussion. In this section, we present smoother results from four different tracking scenarios. Some results shown in Chapter V were taken from a fifth scenario - all five scenarios are presented in Fig. C.1. With reference to the results in Chapter V, the first scenario corresponds to Figs. 5.26, 5.27, and 5.29 through 5.31, while the second scenario corresponds to Figs. 5.28 and 5.32. These results concentrate on acceleration state estimation, since that is by far the most critical variable for our purposes. The plots show mean error in state estimates by the (1) Kalman filter, (2) smoother, and, where required, (3) least squares quadratic curve fit to smoothed position, respectively, for Monte Carlo sets of 20 runs with the parameters given in Chapter V. Also shown are the true mean error and mean +/- one standard deviation bounds for the filter and smoother, as well as the filter and smoother-predicted error standard deviations (filter and smoother-predicted mean error are zero by definition). Standard deviation curves for the true error in the curve fit estimate are not shown to maintain clarity, but are on the order of or slightly smaller than the corresponding standard deviations for the smoother true error.

Figure C.1. Target Trajectories for Smoother Discussion

In  each  case,  target  accelerations  last  for  eight  seconds,  starting  at  2.0  or  3.0  seconds, 
depending  on  the  scenario  (maneuver  onset  time  is  clear  from  the  plots).  In  all  but  the 
last case, the Kalman filter track begins at 0.0 seconds, and the smoothed estimate starts at 2.0 seconds and runs until 11.9 seconds (using a posteriori information from 4.0 to 13.9 seconds, respectively - a two-second time lag). The last case uses a one-second time lag, and the unique timing issues for that case will be noted below. The difference in start time for the maneuvers is simply to show that the smoother performance is robust with respect to changes in start time. Figures C.2 and C.3 show the error in east (inertial y axis) target acceleration and position, respectively, for Scenario 2. The position error plot does not show a mean curve fit position error since this plot follows the mean smoother error curve exactly.

Note  that  the  positive  position  estimate  error  indicates  that  the  smoothed  position 
estimate  leads  the  true  position  somewhat,  while  the  filter  position  lags  significantly.  The 
key  point,  alluded  to  above,  regarding  this  filter  position  estimate  lag  is  that  curve  fitting 
to  filter  position  estimates,  followed  by  twofold  differentiation  to  find  acceleration,  did 
not  yield  adequate  acceleration  estimates  -  the  typical  position  estimate  “lag”  as  seen 
here  translates  into  a  low  acceleration  estimate  after  curve  fitting  and  differentiation.  The 
relatively  small  position  lead  error  from  the  smoother  has  an  unnoticeable  effect  on  the 
acceleration  estimate. 
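
As a minimal illustration of the curve fit step described above, the following Python
sketch fits a least squares quadratic to a short window of position samples and
differentiates the fit twice, so that the acceleration estimate is simply twice the
quadratic coefficient. The function names, the window length, and the sample trajectory
are assumptions made for illustration only; they are not taken from the author's
simulation code.

```python
def fit_quadratic(times, positions):
    """Least squares fit p(t) = c0 + c1*t + c2*t**2; returns (c0, c1, c2)."""
    # Normal equations A c = b built from power sums of the sample times.
    s = [sum(t ** k for t in times) for k in range(5)]
    b = [sum(p * t ** k for t, p in zip(times, positions)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        acc = m[i][3] - sum(m[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = acc / m[i][i]
    return tuple(coeffs)


def acceleration_from_fit(times, positions):
    """Second derivative of the fitted quadratic (constant over the window)."""
    return 2.0 * fit_quadratic(times, positions)[2]


# Hypothetical data: a two-second window sampled at 10 Hz with a constant
# 1 g (32 ft/s^2) acceleration, standing in for smoothed position estimates.
times = [0.1 * i for i in range(21)]
positions = [100.0 + 50.0 * t + 0.5 * 32.0 * t * t for t in times]
accel_estimate = acceleration_from_fit(times, positions)
```

Position samples that lag the true trajectory, as the filter estimates do here, would be
expected to pull the fitted quadratic coefficient down, which is consistent with the low
acceleration estimates noted above.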

Also,  note  that  just  before  the  commanded  acceleration  ceases  at  11  seconds,  the 
smoother  position  estimate  starts  to  lag  the  true  position.  This  is  due  to  the  fact  that 
the  smoother  is  “aware”  that  the  acceleration  will  soon  cease,  and  has  begun  to  react 
appropriately. 

Figures C.4, C.5 and C.6 show the error in east (inertial y axis) target acceleration
and position, and downward (inertial z axis) target acceleration, respectively, for Scenario
3. This scenario is provided to show that smoothing provides considerable robustness to
choices of target acceleration model parameters. Recall from Chapter V that this filter is
tuned for a (“benign”) target acceleration standard deviation of one g (32 feet per second
squared) - clearly, the filter cannot follow the 6 g target in Scenario 3 at all well, but the
smoother reduces state estimation errors dramatically. This performance is achieved even
though this target, due to its relatively low speed and high g turn, actually reverses its
direction over the course of the maneuver. Thus, it appears that we can use “benign” filter
tuning parameters and thereby “damp out” trajectory estimate error due to measurement
noise, but preserve low estimation errors for highly dynamic targets by smoothing.

Figure C.2. Error in Target Acceleration Estimates - Inertial Y Axis, Scenario 2

Figure C.3. Error in Target Position Estimates - Inertial Y Axis, Scenario 2

Figure C.4. Error in Target Acceleration Estimates - Inertial Y Axis, Scenario 3

Figure C.5. Error in Target Position Estimates - Inertial Y Axis, Scenario 3

Figure C.6. Error in Target Acceleration Estimates - Inertial Z Axis, Scenario 3

Figure C.7. Error in Target Acceleration Estimates - Inertial Y Axis, Scenario 4

Figures  C.7  and  C.8  show  the  error  in  east  (inertial  y  axis)  and  downward  (inertial 
z  axis)  target  acceleration,  respectively,  for  Scenario  4.  Fig.  C.7  shows  that  improvement 
in  the  state  estimate  is  gained  even  for  benign  maneuvers,  while  Fig.  C.8  demonstrates 
that  optimal  smoothing  does  not  provide  improvement  in  the  absence  of  an  unmodelled 
external  driving  force  (an  unmodelled  “Q”),  as  discussed  in  [154:13-14]. 

Figures  C.9  and  C.10  show  the  error  in  east  (inertial  y  axis)  and  downward  (inertial 
z  axis)  target  acceleration,  respectively,  for  Scenario  5.  Results  here  are  consistent  with 
previous  cases,  for  this  somewhat  different  trajectory. 

Finally, to observe the effect of a reduced lag time, Fig. C.11 repeats the scenario of
Fig. C.9 with a one-second fixed lag time, instead of two seconds. Note the considerable
degradation in the accuracy of the smoother-derived acceleration estimate. Some curves
are not labeled due to the close spacing, but the general effect of this time reduction is clear.
The curve fit-derived estimate is not substantially degraded in the mean, however - a more
dynamic turn would have produced more error in this estimate. Note that the smoother
estimate now lasts for an additional second, since the smoother cutoff is referenced to the
same real time point as in the previous cases, and the smoother now runs until one second
prior to that time, rather than two seconds prior, as in the previous cases.

C.7  Equations  for  Direction  Cosine  Matrix  (DCM)  Generation 

The purpose of this section is to provide the reader with simple procedures for
generating direction cosine matrices (DCM) for tracking problems. We seek to avoid, to the
extent that we can, the laborious process of defining Euler angles.

C.7.1 DCM for Sensor-to-Inertial Frame Transformation. The following discussion
applies equally to the transformations (1) from the true line-of-sight or sensor frame
to the inertial frame, and the inverse transformation, or (2) from the filter-predicted
line-of-sight or sensor frame to the inertial frame, and its inverse. The only difference is that in
transformation (1), the true target location is used, while in transformation (2), the latest
best filter estimate of target location is used. The first set of transformations, not actually
available to the tracking algorithm, is used to define “truth” measurement values, including
noise realizations. The second set of transformations is used to define the Kalman filter
measurement matrix H, expected measurement values, and so on - these transformations
are of course available to the filter. Note that this discussion assumes perfect knowledge
of sensor or “ownship” inertial position and orientation with respect to the local level. If
required, one can further define true and filter-assumed inertial frames (see appropriate
Fortran code in [143]).

Figure C.10. Error in Target Acceleration Estimates - Inertial Z Axis, Scenario 5

Figure C.11. Error in Target Acceleration Estimates - Inertial Y Axis, Scenario 5 (One
Second Fixed Lag)

Obtaining the direction cosine matrix for sensor-to-inertial frame transformation
starts by defining the sensor-to-target or boresight vector in inertial frame coordinates.
Dividing this vector by its length (range), we obtain the sensor-to-target unit vector,
$\mathbf{i}_B$.

Now, take the vector cross product of the local downward or inertial z-axis unit vector
$\mathbf{i}_z$ ($[0, 0, 1]^T$, in inertial coordinates) with the sensor-to-target unit vector
($\mathbf{i}_z \times \mathbf{i}_B$). The resulting “elevation axis” vector is generally of
non-unit length, and, by definition of the cross product, normal to the two vectors crossed
to produce it. Thus, it is parallel to the local horizontal. Now divide the elevation axis
vector by its length to normalize it, producing the elevation axis unit vector $\mathbf{i}_E$
in inertial frame coordinates.

Next, cross the sensor-to-target unit vector $\mathbf{i}_B$ into the elevation axis unit
vector $\mathbf{i}_E$ ($\mathbf{i}_B \times \mathbf{i}_E$) to produce the azimuth axis unit
vector $\mathbf{i}_A$ in inertial frame coordinates. Since the two unit vectors crossed to
produce $\mathbf{i}_A$ were normal or perpendicular to each other, their cross product is
automatically of unit length also, and does not require subsequent normalization. The
sensor frame is now defined by $\mathbf{i}_B$, $\mathbf{i}_E$, and $\mathbf{i}_A$, as shown
schematically in Fig. 4.1.

Each of $\mathbf{i}_B$, $\mathbf{i}_E$, and $\mathbf{i}_A$ is a three-element column vector
of inertial frame coordinates. Arranging these three column vectors side-by-side as a three
by three matrix, we immediately have the direction cosine matrix $\mathbf{C}_{ls}^{i}$ for
converting from the sensor or line-of-sight (subscript $ls$) frame to the inertial
(superscript $i$) frame. That is, multiplying $\mathbf{C}_{ls}^{i}$ times a vector
coordinatized in the sensor frame (vector element order: boresight axis component,
elevation axis component, azimuth axis component) yields a vector coordinatized in
the inertial frame. Finally, since $\mathbf{C}_{ls}^{i}$ is a unitary matrix, its inverse
$\mathbf{C}_{i}^{ls}$ is equal to its transpose.
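
The cross product construction above can be sketched in a few lines of Python. This is
an illustrative sketch only, not the Fortran implementation referenced in [143]; the
function names and the example sensor and target positions are hypothetical, and the
inertial axes are assumed ordered with the third axis pointing down, as in the text. Note
that the construction is undefined when the boresight is vertical, since the first cross
product then vanishes.

```python
import math


def cross(a, b):
    """Vector cross product a x b for three-element lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(v):
    """Normalize a three-element vector; raises if its length is zero."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return [c / norm for c in v]


def sensor_to_inertial_dcm(sensor_pos, target_pos):
    """DCM whose columns are the boresight, elevation, and azimuth unit vectors."""
    i_b = unit([t - s for s, t in zip(sensor_pos, target_pos)])  # boresight
    i_z = [0.0, 0.0, 1.0]                                        # local down
    i_e = unit(cross(i_z, i_b))   # elevation axis, parallel to local horizontal
    i_a = cross(i_b, i_e)         # azimuth axis, already of unit length
    return [[i_b[r], i_e[r], i_a[r]] for r in range(3)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


# Hypothetical geometry (feet): sensor at the origin, target ahead, to the
# right, and 1000 ft above the sensor (negative z, since z points down).
sensor = [0.0, 0.0, 0.0]
target = [3000.0, 4000.0, -1000.0]
c_ls_to_i = sensor_to_inertial_dcm(sensor, target)
c_i_to_ls = transpose(c_ls_to_i)   # inverse equals transpose

# A vector of length "range" along the sensor boresight axis maps back to the
# inertial sensor-to-target vector.
rng = math.sqrt(sum((t - s) ** 2 for s, t in zip(sensor, target)))
recovered = matvec(c_ls_to_i, [rng, 0.0, 0.0])
```

The same routine applies to either the true or the filter-predicted sensor frame; only
the target position passed in changes.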

C.7.2 DCM for Target Lift-to-Inertial Frame Transformation. In Chapter V,
we  discussed  definitions  for  the  target  velocity  and  target  body  coordinate  frames.  The 
purpose  of  this  section  is  to  expand  upon  that  discussion  and  consider  some  details  of  the 
author’s  implementation  and  use  of  coordinate  frames. 

In Sect. 5.5.3.1, we found a representation for the velocity frame and the DCM
transformation $\mathbf{C}_i^v$ from inertial to velocity frames using the classical (Euler)
angles of heading and flight path angle. Following the discussion on the sensor frame in the
previous section, the reader may now note that the velocity frame is defined relative to the
velocity unit vector exactly as the sensor frame is defined relative to the boresight vector.
This fact presents a convenient alternative to Euler angles for obtaining $\mathbf{C}_i^v$ -
i.e., through vector cross products as in the previous section. The result is equivalent
element for element to that found using Euler angles as shown in Eqn. (5.4).

As a useful intermediate between the velocity and body coordinate frames, the
author’s simulations occasionally make use of a “lift” frame. This lift frame (denoted by
subscript L) is essentially identical to the “wind” frame (denoted by subscript w) as
defined by Etkin [79], distinguished from the inertial frame by heading angle $\eta$, flight path
angle $\gamma$, and a roll angle p, in that order. The author desires to distinguish between
Etkin’s wind frame and the lift frame, however, since the last Euler angle (roll) taking the
inertial frame into Etkin’s wind frame is defined by a vehicle plane of symmetry, while in our
case, this angle is defined by the direction of the aerodynamic normal load or lift vector.
In fact, these definitions are equivalent in most cases, but the possibility of a difference
should be kept in mind.

Definition of the DCM for target lift-to-inertial frame transformation starts by
defining $\mathbf{C}_i^v$ using either Euler angles or vector cross products. Depending upon
the application, target state values may be actual (true) or estimated quantities, and
simulation code must ensure that appropriate values are used.
Next, define the target acceleration normal to and along the velocity vector. The
desired inertial acceleration normal to the velocity vector can be isolated into velocity
frame components (1) $a_{y_v}$, parallel to the local horizontal and orthogonal to the velocity
vector, and (2) $a_{z_v}$, generally downward, mutually orthogonal to the velocity vector and
$a_{y_v}$. The specific force normal to the velocity vector required to achieve this acceleration
is assumed due to aerodynamic lift, and can be isolated into components (1) $a_{y_v}$, (2) $a_{z_v}$,
and (3) $g\cos(\gamma)$, parallel to $a_{z_v}$ and counteracting the component of gravity along the
$z_v$ axis. The angle p is then calculated as shown in Fig. 5.24. Note that lift is positive in the
direction opposite to the lift frame $z_L$ vector, which points generally downward.

The lift frame and the velocity frame are distinguished only by the roll angle p about
the velocity vector. Thus, we obtain the DCM for transformations from the inertial to the
lift frame by premultiplying $\mathbf{C}_i^v$ by the direction cosine matrix $\mathbf{C}_v^L$,
which is found as:

$$\mathbf{C}_v^L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(p) & -\sin(p) \\ 0 & \sin(p) & \cos(p) \end{bmatrix} \tag{C.31}$$
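
A minimal Python sketch of Eqn. (C.31) and the premultiplication step follows. The
function names and the example inertial-to-velocity matrix are assumptions for
illustration; the sign convention simply copies the matrix as printed above.

```python
import math


def roll_dcm(p):
    """Velocity-to-lift DCM for a roll angle p (radians), as in Eqn. (C.31)."""
    c, s = math.cos(p), math.sin(p)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inertial_to_lift(c_i_v, p):
    """Premultiply the inertial-to-velocity DCM by the roll DCM."""
    return matmul(roll_dcm(p), c_i_v)


# Hypothetical inertial-to-velocity DCM: level flight with the velocity
# vector along the inertial y (east) axis.
c_i_v = [[0.0, 1.0, 0.0],
         [-1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]
c_i_l = inertial_to_lift(c_i_v, math.radians(30.0))
```

Because the roll is taken about the velocity vector, the first row of the result (the
velocity axis expressed in inertial coordinates) is unchanged by the premultiplication.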

The usual assumptions for aircraft wind and body frames call for the transformation
from wind frame to body frame to proceed by a two-part rotation: first about the $z_w$ axis
by a sideslip angle $\beta$, followed by an angle-of-attack rotation $\alpha$ about the (now
rotated) $y_w$ axis (nominally, through the wings), as shown in Fig. 5.23 and [79]. As noted in
Sect. 5.5.3.1, however, the nominal assumption in this research calls for a zero sideslip
angle. For that reason, the nominal target body frame is generally found by rotating the
lift frame directly by an angle of attack $\alpha$ about the $y_L$ axis (the lift frame axis
mutually orthogonal to the lift and velocity vectors - generally, through the wings, as for
the wind frame $y_w$ axis [79]).

For  the  development  in  Chapter  IV,  it  was  desirable  also  to  model  sideslip  angle,  but 
transformations  from  the  lift  frame  to  the  body  frame  involving  this  angle  were  modelled 
as  proceeding  in  the  order  of  angle  of  attack  first,  then  sideslip.  This  was  done  because 
angle  of  attack  is  the  major  variable  in  force  calculations,  and  the  author  did  not  desire 
to recompute angle of attack following changes in sideslip angle. In any case, for small
angular  displacements,  effects  from  these  two  rotations  about  orthogonal  axes  may  be 
approximated  as  independent  of  order,  as  motivated  in  [187:2.17]. 
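
The small-angle claim can be checked numerically with the following Python sketch, which
applies an angle-of-attack rotation and a sideslip rotation in both orders and compares the
resulting matrices. The elementary rotation sign conventions and the function names are
assumptions chosen for illustration and may differ from those of Fig. 5.23; the discrepancy
between the two orderings is on the order of the product of the two angles in either case.

```python
import math


def rot_y(alpha):
    """Elementary frame rotation about the y axis (angle of attack)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 0.0, -s],
            [0.0, 1.0, 0.0],
            [s, 0.0, c]]


def rot_z(beta):
    """Elementary frame rotation about the z axis (sideslip)."""
    c, s = math.cos(beta), math.sin(beta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def order_discrepancy(alpha, beta):
    """Largest element-wise difference between the two rotation orders."""
    alpha_first = matmul(rot_z(beta), rot_y(alpha))   # alpha, then beta
    beta_first = matmul(rot_y(alpha), rot_z(beta))    # beta, then alpha
    return max(abs(alpha_first[i][j] - beta_first[i][j])
               for i in range(3) for j in range(3))


# Hypothetical angles (radians): the discrepancy shrinks roughly a
# hundredfold when both angles are reduced tenfold.
small = order_discrepancy(0.02, 0.02)
large = order_discrepancy(0.2, 0.2)
```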
Peschon, and then we will consider modifications that might be desired to provide more
rigorous  answers,  if  the  basic  assumptions  of  this  development  do  not  apply. 


3.6.3 Relating $I^*(x_k, k)$ to Object Class Probability $p(\omega_i \mid Z_k, Z_k^{dm})$. Recall
that the “practical version” of the Larson and Peschon method as implemented on
different object models $\omega_i$ would find the $X_{k,n}$, say $X_{k,n}^*$, for each $\omega_i$, that
maximizes $p(X_{k,n} \mid Z_k, Z_k^{dm}, \omega_i)$, not by maximizing this conditional probability
directly, but rather by maximizing $I^*(x_k, k)$, where (using the original Larson and Peschon
form, but recalling that in our development for object recognition, all probabilities would
be conditioned also on $Z_k^{dm}$ and $\omega_i$):

$$\max_{x_k} I^*(x_k, k) = \max_{x_k} \left\{ \max_{x_{k-1}} \left[ p(z_k \mid x_k)\, p(x_k \mid x_{k-1})\, I^*(x_{k-1}, k-1) \right] \right\} \tag{3.10}$$

or, in our form:

$$\max_{x_{k,n}} I^*(x_{k,n}, k \mid \omega_i) = \max_{x_{k,n}} \left\{ \max_{x_{k-1,n}} \left[ p(z_k \mid x_{k,n}, \omega_i)\, p(x_{k,n} \mid x_{k-1,n}, Z_k^{dm}, \omega_i)\, I^*(x_{k-1,n}, k-1 \mid \omega_i) \right] \right\} \tag{3.11}$$

Maximizing this quantity rather than the conditional probability is desirable because
we avoid having to compute values for all $X_{k,n} \in \mathcal{X}_k$, which we would have to do to
find the denominator term as in Eqn. (3.6). Examining $I^*(x_{k,n}, k \mid \omega_i)$ closely, note
that the preceding equation is equivalent to:

$$\max_{x_{k,n}} I^*(x_{k,n}, k \mid \omega_i) = \max_{X_{k,n}} \left[ p(X_{k,n}, Z_k \mid Z_k^{dm}, \omega_i) \right] \tag{3.12}$$

Thus, the practical version of the Larson and Peschon equations gives the maximum
value of $p(X_{k,n}, Z_k \mid Z_k^{dm}, \omega_i)$, and the state history estimate $X_{k,n}^*$ for a given
$\omega_i$ which gives that maximum joint conditional probability density, or, equivalently, since
there is only one $Z_k$, that state history which maximizes $p(X_{k,n} \mid Z_k, Z_k^{dm}, \omega_i)$.
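
To make the recursion of Eqn. (3.11) concrete, the following Python sketch runs the same
maximization over a small discrete state set for a single object class. It is a sketch under
simplifying assumptions, not the author's implementation: states are indexed by integers,
the transition probabilities are held fixed in time (in the text they are conditioned on the
kinematic measurement history and so may change at each step), and all numerical values
and names are hypothetical.

```python
def max_product_recursion(prior, transition, likelihoods):
    """Forward max-product recursion with backtracking.

    prior[j]          : initial value I*(x_0 = j, 0)
    transition[i][j]  : p(x_k = j | x_{k-1} = i)
    likelihoods[k][j] : p(z_{k+1} | x_{k+1} = j), one row per measurement
    Returns (maximum joint value, maximizing state history).
    """
    n = len(prior)
    score = list(prior)      # I*(x_{k-1}, k-1) for every state
    back_pointers = []
    for lik in likelihoods:
        new_score, pointers = [], []
        for j in range(n):
            # Inner maximization over the previous state.
            best_i = max(range(n), key=lambda i: transition[i][j] * score[i])
            new_score.append(lik[j] * transition[best_i][j] * score[best_i])
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)
    # Outer maximization over the final state, then trace the history back.
    state = max(range(n), key=lambda j: score[j])
    best_value = score[state]
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return best_value, path


# Hypothetical two-state example with three measurements.
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
likelihoods = [[0.7, 0.1],
               [0.6, 0.3],
               [0.1, 0.8]]
best_value, best_path = max_product_recursion(prior, transition, likelihoods)
```

Running the recursion once per object class, with class-specific likelihoods, yields one
maximum joint value and one maximizing state history per class without enumerating every
possible state history.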

Suppose on the other hand that we had chosen to find $p(X_{k,n}, Z_k, \omega_i \mid Z_k^{dm})$ for all
$X_{k,n} \in \mathcal{X}_k$. This can be had by computing $p(X_{k,n}, Z_k \mid Z_k^{dm}, \omega_i)$ for each $X_{k,n}$ over each $\omega_i$